SVDq: 1.25-bit and 410x Key Cache Compression for LLM Attention

January 2025


Publication

arXiv preprint arXiv:2502.15304
