SVDq: 1.25-bit and 410x Key Cache Compression for LLM Attention

Publication
arXiv preprint arXiv:2502.15304
