KV Estimator
Pure functions for computing KV cache memory requirements. No side effects, no runtime dependencies.
kvBytesPerToken(arch, precision?)
Returns the number of bytes required to store one token’s KV cache entry.
2 × layers × kvHeads × headDim × precisionBytes
The 2× accounts for both K and V tensors.
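As a worked illustration of the formula above, here is a minimal sketch in TypeScript. The `ArchSketch` field names, the `PrecisionSketch` labels and their byte widths, and the `fp16` default are assumptions made for this example; the library's actual types and default precision may differ.

```typescript
// Hypothetical architecture shape: field names mirror the formula's terms,
// not necessarily the library's real type.
interface ArchSketch {
  layers: number;   // number of transformer layers
  kvHeads: number;  // number of key/value heads per layer
  headDim: number;  // dimension of each head
}

// Assumed precision labels and their storage width in bytes per element.
type PrecisionSketch = "fp32" | "fp16" | "int8";

const PRECISION_BYTES: Record<PrecisionSketch, number> = {
  fp32: 4,
  fp16: 2,
  int8: 1,
};

// Sketch of the per-token calculation:
// 2 × layers × kvHeads × headDim × precisionBytes.
// The "fp16" default is an assumption, not documented behavior.
function kvBytesPerTokenSketch(
  arch: ArchSketch,
  precision: PrecisionSketch = "fp16",
): number {
  return 2 * arch.layers * arch.kvHeads * arch.headDim * PRECISION_BYTES[precision];
}

// Example: 32 layers, 8 KV heads, head dimension 128, stored as fp16.
const example: ArchSketch = { layers: 32, kvHeads: 8, headDim: 128 };
const bytes = kvBytesPerTokenSketch(example, "fp16");
console.log(bytes); // 131072 (128 KiB per token)
```

With these example numbers, a 4,096-token context would need 4096 × 131072 bytes, or 512 MiB, of KV cache.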