DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale
Reza Yazdani Aminabadi,
Samyam Rajbhandari,
Minjia Zhang,
Ammar Ahmad Awan,
Cheng Li,
Du Li,
Elton Zheng,
Jeff Rasley,
Shaden Smith,
Olatunji Ruwase,
Yuxiong He
Sirius: An Open End-to-End Voice and Vision Personal Assistant and Its Implications for Future Warehouse Scale Computers
Johann Hauswald,
Michael A. Laurenzano,
Yunqi Zhang,
Cheng Li,
Austin Rovinski,
Arjun Khurana,
Ronald G. Dreslinski,
Trevor Mudge,
Vinicius Petrucci,
Lingjia Tang,
Jason Mars