Cheng Li

Senior Software Engineer

Databricks GenAI

About Me

I am a senior software engineer at Databricks GenAI. My work focuses on optimizing the training and inference of deep learning (DL) models, particularly large language models (LLMs) and large multimodal models (LMMs).

At Databricks, I worked on building DBRX and optimizing its training performance (three months of training on 3072 H100 GPUs). I optimized memory usage, computation, and communication to achieve state-of-the-art (SOTA) training efficiency. Refer to Building DBRX-class Custom LLMs with Mosaic AI Training for more details. Currently, I am optimizing Llama3 and DBRX inference performance.

Before joining Databricks, I was a senior researcher at Microsoft, where I worked on improving the performance and usability of LLMs/LMMs in production (GitHub Copilot, DALL·E2, etc.), creating SOTA AI system technologies, and building up Microsoft DeepSpeed, an open-source library that enables unprecedented scale and speed for training and inference.

I developed and open-sourced llm-analysis: Latency and Memory Analysis of Transformer Models for Training and Inference. It helps plan resources for training and inference and suggests optimization opportunities. Check it out!

Interests

  • Large Language Models and Multimodal Models
  • System Optimization and Engineering for Deep Learning
  • GPU and Parallel Computing

Education

  • PhD in Computer Science, 2020

    University of Illinois Urbana-Champaign

  • MS in Computer Science and Engineering, 2015

    University of Michigan

  • BS in Computer Engineering, 2013

    University of Michigan

  • BS in Electrical Engineering, 2013

    Shanghai Jiao Tong University

Experience

Senior Software Engineer

Databricks

Aug 2023 – Present Bellevue, WA

Senior Researcher

Microsoft

Aug 2020 – Aug 2023 Bellevue, WA

Research Intern

Alibaba Group

May 2019 – Aug 2019 Sunnyvale, CA

Teaching Assistant for the 9th Programming and Tuning Massively Parallel Systems + Artificial Intelligence summer school (PUMPS+AI)

BSC, UPC and UIUC

Jul 2018 – Jul 2018 Barcelona, Spain

Research Intern

IBM TJ Watson Research Center

May 2018 – Aug 2018 Yorktown Heights, NY

Research Intern

IBM TJ Watson Research Center

May 2017 – Aug 2017 Yorktown Heights, NY

Head Teaching Assistant for ECE408/CS483: Applied Parallel Programming

UIUC

Aug 2016 – Dec 2016 Champaign, IL

Publications

Languages

Python, C/C++, CUDA, Go, JavaScript, Bash

Chinese, English

Contact