- formatting
- images
- links
- math
- code
- blockquotes
-
My Solutions to "How to scale your model"
-
30 Interview Questions on LLM Inference
How many FLOPs is attention? When does a KV cache hurt performance? Why does beam search cause your KV cache memory to explode?
-
Can You Solve Dynamic Programming Problems Faster? Part 1, One-Dimensional LWS
We prove you can solve dynamic programming problems polynomially faster if you have a simple cost function for kD LWS problems.