Publications

  • ArXiv 2024
    Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs
    Authors: Enshu Liu*, Junyi Zhu*, Zinan Lin+, Xuefei Ning+, Matthew B. Blaschko, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang+ Paper Code
  • ArXiv 2024
    Can LLMs Learn by Teaching? A Preliminary Study
    Authors: Xuefei Ning*+, Zifu Wang*, Shiyao Li*, Zinan Lin*+, Peiran Yao*, Tianyu Fu, Matthew B. Blaschko, Guohao Dai, Huazhong Yang, Yu Wang+ Paper Code
  • ArXiv 2024
    MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression
    Authors: Tianyu Fu*, Haofeng Huang*, Xuefei Ning*+, Genghan Zhang, Boju Chen, Tianqi Wu, Hongyi Wang, Zixiao Huang, Shiyao Li, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang+ Paper Code
  • ArXiv 2024
    DiTFastAttn: Attention Compression for Diffusion Transformer Models
    Authors: Zhihang Yuan*, Pu Lu*, Hanling Zhang*, Xuefei Ning+, Linfeng Zhang, Tianchen Zhao, Shengen Yan, Guohao Dai, Yu Wang+ Paper Code
  • ArXiv 2024
    ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
    Authors: Tianchen Zhao, Tongcheng Fang, Enshu Liu, Wan Rui, Widyadewi Soedarmadji, Shiyao Li, Zinan Lin, Guohao Dai, Shengen Yan, Huazhong Yang, Xuefei Ning+, Yu Wang+ Paper
  • ArXiv 2024
    DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis
    Authors: Yao Teng, Yue Wu, Han Shi, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li, Xihui Liu+ Paper Code
  • ArXiv 2024
    Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
    Authors: Enshu Liu*, Junyi Zhu*, Zinan Lin+, Xuefei Ning+, Matthew B. Blaschko, Sergey Yekhanin, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang+ Paper Code
  • ECCV 2024
    MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization
    Authors: Tianchen Zhao*, Xuefei Ning*+, Tongcheng Fang*, Enshu Liu, Guyue Huang, Zinan Lin, Shengen Yan, Guohao Dai, Yu Wang+ Paper Code
  • ArXiv 2024
    A Survey on Efficient Inference for Large Language Models
    Authors: Zixuan Zhou*, Xuefei Ning*+, Ke Hong*, Tianyu Fu, Jiaming Xu, Shiyao Li, Yuming Lou, Luning Wang, Zhihang Yuan, Xiuhong Li, Shengen Yan, Guohao Dai+, Xiao-Ping Zhang, Yuhan Dong, Yu Wang+ Paper
  • ArXiv 2024
    LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K
    Authors: Tao Yuan, Xuefei Ning+, Dong Zhou, Zhijie Yang, Shiyao Li, Minghui Zhuang, Zheyue Tan, Zhuyu Yao, Dahua Lin, Boxun Li, Guohao Dai+, Shengen Yan, Yu Wang+ Paper Code
  • FPGA 2024
    FlightLLM: Efficient Large Language Model Inference with a Complete Mapping Flow on FPGAs
    Authors: Shulin Zeng*, Jun Liu*, Guohao Dai+, Xinhao Yang, Tianyu Fu, Hongyi Wang, Wenheng Ma, Hanbo Sun, Shiyao Li, Zixiao Huang, Yadong Dai, Jintao Li, Zehao Wang, Ruoyu Zhang, Kairui Wen, Xuefei Ning, Yu Wang+ Paper
  • DATE 2024
    DyPIM: Dynamic-inference-enabled Processing-In-Memory Accelerator
    Authors: Tongxin Xie, Tianchen Zhao, Zhenhua Zhu, Xuefei Ning, Bing Li, Guohao Dai, Huazhong Yang, Yu Wang Paper
  • ICML 2024
    Evaluating Quantized Large Language Models
    Authors: Shiyao Li, Xuefei Ning+, Luning Wang, Tengxuan Liu, Xiangsheng Shi, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang+ Paper Code
  • CVPR 2024
    FlashEval: Towards Fast and Accurate Evaluation of Text-to-image Diffusion Generative Models
    Authors: Lin Zhao*, Tianchen Zhao*, Zinan Lin+, Xuefei Ning+, Guohao Dai, Huazhong Yang, Yu Wang+ Paper Code
  • ICLR 2024
    A Unified Sampling Framework for Solver Searching of Diffusion Probabilistic Models
    Authors: Enshu Liu, Xuefei Ning+, Huazhong Yang, Yu Wang+ Paper
  • ICLR 2024
    Skeleton-of-Thought: Prompting Large Language Models for Efficient Parallel Generation
    Authors: Xuefei Ning*+, Zinan Lin*, Zixuan Zhou*, Zifu Wang, Huazhong Yang, Yu Wang+ Paper Code
  • WACV 2024
    TCP: Triplet Contrastive-relationship Preserving for Class-Incremental Learning
    Authors: Shiyao Li, Xuefei Ning+, Shanghang Zhang, Lidong Guo, Tianchen Zhao, Huazhong Yang, Yu Wang+ Paper
  • NeurIPS Workshop 2023
    LLM-MQ: Mixed-precision Quantization for Efficient LLM Deployment
    Authors: Shiyao Li, Xuefei Ning+, Ke Hong, Tengxuan Liu, Luning Wang, Xiuhong Li, Kai Zhong, Guohao Dai, Huazhong Yang, Yu Wang+ Paper
  • ICCV 2023
    Ada3D: Exploiting the Spatial Redundancy with Adaptive Inference for Efficient 3D Object Detection
    Authors: Tianchen Zhao, Xuefei Ning+, Ke Hong, Zhongyuan Qiu, Pu Lu, Linfeng Zhang, Yali Zhao, Lipu Zhou, Guohao Dai, Huazhong Yang, Yu Wang+ Paper
  • ICML 2023
    OMS-DPM: Optimizing the Model Schedule for Diffusion Probabilistic Models
    Authors: Enshu Liu*, Xuefei Ning*+, Zinan Lin*, Huazhong Yang, Yu Wang+ Paper Code
  • AAAI 2023 (Oral)
    Dynamic Ensemble of Low-fidelity Experts: Mitigating NAS "Cold-Start"
    Authors: Junbo Zhao*, Xuefei Ning*+, Enshu Liu, Binxin Ru, Zixuan Zhou, Tianchen Zhao, Chen Chen, Jiajin Zhang, Qingmin Liao, Yu Wang+ Paper Code
  • AAAI 2023
    Memory-Oriented Structural Pruning for Efficient Image Restoration
    Authors: Xiangsheng Shi*, Xuefei Ning*+, Lidong Guo*, Tianchen Zhao, Enshu Liu, Yi Cai, Yuhan Dong, Huazhong Yang, Yu Wang+ Paper
  • AAAI 2023
    Ensemble-in-One: Ensemble Learning within Random Gated Networks for Enhanced Adversarial Robustness
    Authors: Yi Cai, Xuefei Ning, Huazhong Yang, Yu Wang Paper
  • TPAMI 2023
    A Generic Graph-based Neural Architecture Encoding Scheme with Multifaceted Information
    Authors: Xuefei Ning, Yin Zheng, Zixuan Zhou, Tianchen Zhao, Huazhong Yang, Yu Wang Paper Code
  • DATE 2022 & TCAD 2023
    Gibbon: Efficient Co-Exploration of NN Model and Processing-In-Memory Architecture
    Authors: Hanbo Sun*, Chenyu Wang*, Zhenhua Zhu, Xuefei Ning+, Guohao Dai, Huazhong Yang, Yu Wang+ Paper
  • TCAD 2022
    Exploring the Potential of Low-bit Training of Convolutional Neural Networks
    Authors: Kai Zhong, Xuefei Ning, Guohao Dai, Zhenhua Zhu, Tianchen Zhao, Shulin Zeng, Yu Wang+, Huazhong Yang Paper
  • CVPR 2022
    CodedVTR: Codebook-based Sparse Voxel Transformer with Geometric Guidance
    Authors: Tianchen Zhao, Niansong Zhang, Xuefei Ning, He Wang, Li Yi, Yu Wang Paper
  • CVPR 2022
    FedCor: Correlation-Based Active Client Selection Strategy for Heterogeneous Federated Learning
    Authors: Minxue Tang, Xuefei Ning, Yitu Wang, Jingwei Sun, Yu Wang, Hai Li, Yiran Chen+ Paper
  • ECCV 2022
    CLOSE: Curriculum Learning On the Sharing Extent Towards Better One-shot NAS
    Authors: Zixuan Zhou*, Xuefei Ning*+, Yi Cai, Jiashu Han, Yiping Deng, Yuhan Dong, Huazhong Yang, Yu Wang+ Paper
  • NeurIPS 2022 (Spotlight)
    TA-GATES: An Encoding Scheme for Neural Network Architectures
    Authors: Xuefei Ning*+, Zixuan Zhou*, Junbo Zhao, Tianchen Zhao, Yiping Deng, Changcheng Tang, Shuang Liang, Huazhong Yang, Yu Wang+ Paper
  • Low-Power CV 2022
    Hardware Design and Software Practices for Efficient Neural Network Inference
    Authors: Yu Wang, Xuefei Ning, Shulin Zeng, Yi Cai, Kaiyuan Guo, Hanbo Sun, Changcheng Tang, Tianyi Lu, Shuang Liang, Tianchen Zhao Paper
  • NeurIPS 2021
    Evaluating Efficient Performance Estimators of Neural Architectures
    Authors: Xuefei Ning+, Changcheng Tang, Wenshuo Li, Zixuan Zhou, Shuang Liang, Huazhong Yang+, Yu Wang+ Paper Code
  • ASP-DAC 2020
    Black Box Search Space Profiling for Accelerator-Aware Neural Architecture Search
    Authors: Shulin Zeng, Hanbo Sun, Yu Xing, Xuefei Ning, Yi Shan, Xiaoming Chen, Yu Wang, Huazhong Yang Paper Code
  • ECCV 2020
    A Generic Graph-based Neural Architecture Encoding Scheme for Predictor-based NAS
    Authors: Xuefei Ning, Yin Zheng, Tianchen Zhao, Yu Wang, Huazhong Yang Paper Code
  • ECCV 2020 (Spotlight)
    DSA: More Efficient Budgeted Pruning via Differentiable Sparsity Allocation
    Authors: Xuefei Ning*, Tianchen Zhao*, Wenshuo Li, Peng Lei, Yu Wang, Huazhong Yang Paper
  • ArXiv 2020
    aw_nas: A Modularized and Extensible NAS framework
    Authors: Xuefei Ning, Changcheng Tang, Wenshuo Li, Songyi Yang, Tianchen Zhao, Niansong Zhang, Tianyi Lu, Shuang Liang, Huazhong Yang, Yu Wang Paper Code