Cheng Li
Reza Yazdani Aminabadi
Latest
Understanding INT4 Quantization for Transformer Models: Latency Speedup, Composability, and Failure Cases
DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale