Huawei AI-Solver Group
SVDq: 1.25-bit and 410x Key Cache Compression for LLM Attention
Hong Yankun, Li Xing, Zhen Hui-Ling, Yu Xianzhi, Liu Wulong, Yuan Mingxuan
January 2025
Type: Journal article
Publication: arXiv preprint arXiv:2502.15304