Cheng Li

Member of Technical Staff

Black Forest Labs

About Me

I am a Member of Technical Staff at Black Forest Labs, specializing in optimizing the training and inference efficiency of Large Language Models (LLMs) and Large Vision Models (LVMs).

Previously, I worked at Databricks Mosaic AI, where I played a key role in developing the DBRX model, optimizing memory utilization, computational efficiency, and communication strategies during training to achieve state-of-the-art performance (see Building DBRX-class Custom LLMs with Mosaic AI Training). I collaborated with NVIDIA to resolve FP8 training challenges in TransformerEngine, enabling FP8 training for Mosaic AI models. I also led the technical effort to optimize inference for the Llama and DBRX models.

Prior to Databricks, I was part of Microsoft DeepSpeed, where I enhanced the performance and usability of LLMs in production systems such as GitHub Copilot and DALL·E 2. My work included developing cutting-edge AI system technologies and helping grow DeepSpeed into a leading AI framework.

I created llm-analysis, an open-source tool for analyzing latency and memory in transformer models, helping with resource planning and optimization. Check it out!
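
As a flavor of the kind of estimate llm-analysis automates, here is a rough back-of-the-envelope sketch. It is not the tool's API; it assumes a standard decoder-only transformer with an MLP expansion factor of 4 and ignores biases, layer norms, and the LM head.

    # Illustrative weight-memory estimate for a decoder-only transformer.
    # Not the llm-analysis API; assumes an MLP expansion factor of 4 and
    # ignores biases, layer norms, and the LM head.
    def weight_memory_gb(num_layers: int, hidden_dim: int, vocab_size: int,
                         bytes_per_param: int = 2) -> float:
        # Per layer: ~4*d^2 for attention (Q, K, V, output) plus ~8*d^2 for the MLP.
        params_per_layer = 12 * hidden_dim ** 2
        # Add the token embedding table.
        total_params = num_layers * params_per_layer + vocab_size * hidden_dim
        return total_params * bytes_per_param / 1e9

    # A LLaMA-7B-like shape (32 layers, hidden size 4096, 32k vocab) in bf16
    # comes out to roughly 13 GB of weights:
    print(weight_memory_gb(32, 4096, 32000))

llm-analysis performs this kind of analysis in far more detail, for both training and inference.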

Interests

  • Large Language Models
  • System Optimization and Engineering for Deep Learning
  • GPU and Parallel Computing

Education

  • PhD in Computer Science, 2020

    University of Illinois Urbana-Champaign

  • MS in Computer Science and Engineering, 2015

    University of Michigan

  • BS in Computer Engineering, 2013

    University of Michigan

  • BS in Electrical Engineering, 2013

    Shanghai Jiao Tong University

Experience

Member of Technical Staff

Black Forest Labs

Nov 2024 – Present · Bellevue, WA

Software Engineer

Databricks

Aug 2023 – Nov 2024 · Bellevue, WA

Researcher

Microsoft

Aug 2020 – Aug 2023 · Bellevue, WA

Research Intern

Alibaba Group

May 2019 – Aug 2019 · Sunnyvale, CA

Teaching Assistant for the 9th Programming and Tuning Massively Parallel Systems + Artificial Intelligence summer school (PUMPS+AI)

BSC, UPC and UIUC

Jul 2018 · Barcelona, Spain

Research Intern

IBM TJ Watson Research Center

May 2018 – Aug 2018 · Yorktown Heights, NY

Research Intern

IBM TJ Watson Research Center

May 2017 – Aug 2017 · Yorktown Heights, NY

Head Teaching Assistant for ECE408/CS483: Applied Parallel Programming

UIUC

Aug 2016 – Dec 2016 · Champaign, IL

Publications


Languages

Chinese, English

Contact