Cheng Li
Xiaoxia Wu
Latest
ZeroQuant-V2: Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation
A Comprehensive Study on Post-Training Quantization for Large Language Models
Understanding INT4 Quantization for Transformer Models: Latency Speedup, Composability, and Failure Cases
Random-LTD: Random and Layerwise Token Dropping Brings Efficient Training for Large-scale Transformers