Cheng Li
Yuxiong He
Latest
ZeroQuant-V2: Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation
A Comprehensive Study on Post-Training Quantization for Large Language Models
Understanding INT4 Quantization for Transformer Models: Latency Speedup, Composability, and Failure Cases
DySR: Adaptive Super-Resolution via Algorithm and System Co-design
Random-LTD: Random and Layerwise Token Dropping Brings Efficient Training for Large-scale Transformers
DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale