Since 2024, I have been a research assistant professor in the NICS-EFC group at the Department of Electronic Engineering, Tsinghua University. I obtained my B.S. and Ph.D. degrees from the Department of Electronic Engineering, Tsinghua University, in 2016 and 2021, respectively, advised by Prof. Huazhong Yang and Prof. Yu Wang. I then spent two years (December 2021 to December 2023) as a post-doctoral researcher with Prof. Yu Wang and Prof. Pinyan Lu.
I built and currently lead the Efficient Deep Learning Algorithm (EffAlg) Team in the NICS-EFC group. Our team is continuously looking for self-motivated master's or Ph.D. students, postdoctoral scholars, research assistants, and visiting students. Please send me and Prof. Wang an email with your CV if you’re interested!
NOTE in 2025: After nearly five years of building the team from the ground up, our team is entering a new phase in 2025! From now on, I personally may not be able to advise or participate in a number of projects within my team. Despite that, I think we have already established strong pipelines, a comprehensive knowledge base, a solid platform, and, more importantly, a good group of people with whom new students can collaborate and grow. Therefore, our team continues to welcome new students.
Selected Talks
- Zhiyuan's Annual Discussion on the 10 AI Trends - Inference Optimization (Slide)
  A talk summarizing key work and my opinions on near-term trends, given at the start of 2025.
- Introduction to NICS-EFC Lab Efficient Algorithm Team (Slide, Video)
  A talk introducing our EffAlg Group. @AI Time.
- Generative Model Compression and Acceleration (Slide, Video)
  A talk on generative model compression and acceleration. @Huawei; Apple-China; AMD-China; VIVO; University of Chinese Academy of Sciences; SCUT; and others.
- An Introduction to Quantization of Large Language Models (Slide, Video)
  A talk about efficient LLMs, with a special focus on quantization, for a competition organized by AWS-China.
- Model Compression Towards Efficient Deep Learning Inference (Slide)
  A talk on model compression towards efficient DL inference. @Huawei; Inceptio.ai; Beihang University; and others.
- Neural Architecture Search and Architecture Encoding (Slide)
  A talk on NAS research. @DAMO Academy of Alibaba Group (U.S.); Renmin University of China; and others.
- A Simple Survey on Auto-Parallelism (Slide)
  An internal Chinese-language survey on auto-parallelism methods in 2022.