Cheng Li
Yuxiong He
Latest
ZeroQuant-V2: Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation
A Comprehensive Study on Post-Training Quantization for Large Language Models
Understanding INT4 Quantization for Transformer Models: Latency Speedup, Composability, and Failure Cases
DySR: Adaptive Super-Resolution via Algorithm and System Co-design
Random-LTD: Random and Layerwise Token Dropping Brings Efficient Training for Large-scale Transformers
DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale