Huawei AI-Solver Group

Sinno Jialin Pan

Latest

CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference
KVTuner: Sensitivity-Aware Layer-wise Mixed Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference
PreMoe: Lightening MoEs on Constrained Memory by Expert Pruning and Retrieval
FuseGPT: Learnable Layers Fusion of Generative Pre-trained Transformers

© 2025 Me. This work is licensed under CC BY-NC-ND 4.0.
