2 code implementations • 8 Dec 2023 • Zigeng Chen, Gongfan Fang, Xinyin Ma, Xinchao Wang
To address this challenging trade-off, we introduce SlimSAM, a novel data-efficient SAM compression method that achieves superior performance with far less training data.
2 code implementations • 1 Dec 2023 • Xinyin Ma, Gongfan Fang, Xinchao Wang
Diffusion models have recently gained unprecedented attention in the field of image synthesis due to their remarkable generative capabilities.
1 code implementation • NeurIPS 2023 • Xinyin Ma, Gongfan Fang, Xinchao Wang
With LLMs serving as general-purpose task solvers, we explore their compression in a task-agnostic manner, aiming to preserve the multi-task solving and language generation abilities of the original LLM.
1 code implementation • NeurIPS 2023 • Gongfan Fang, Xinyin Ma, Xinchao Wang
Generative modeling has recently undergone remarkable advancements, primarily propelled by the transformative implications of Diffusion Probabilistic Models (DPMs).
1 code implementation • CVPR 2023 • Gongfan Fang, Xinyin Ma, Mingli Song, Michael Bi Mi, Xinchao Wang
Structural pruning enables model acceleration by removing structurally grouped parameters from neural networks.
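To illustrate what "structurally grouped" means here, the sketch below prunes hidden units of a tiny two-layer MLP: removing an output row of the first weight matrix forces removing the matching input column of the second, so both must be pruned as one group. The L2-norm importance score and the toy network are illustrative assumptions, not the criterion or architecture from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((8, 4))   # layer 1: 4 inputs -> 8 hidden units
W2 = rng.standard_normal((3, 8))   # layer 2: 8 hidden -> 3 outputs

def prune_hidden_units(W1, W2, keep):
    """Keep the `keep` hidden units with the largest grouped norm.

    Each hidden unit couples a row of W1 with a column of W2, so the
    importance score (and the removal) spans both layers at once.
    """
    score = np.linalg.norm(W1, axis=1) + np.linalg.norm(W2, axis=0)
    idx = np.sort(np.argsort(score)[-keep:])
    return W1[idx, :], W2[:, idx]

W1p, W2p = prune_hidden_units(W1, W2, keep=5)
x = rng.standard_normal(4)
y = W2p @ np.maximum(W1p @ x, 0)   # pruned network still runs end to end
print(W1p.shape, W2p.shape, y.shape)
```

Because entire rows and columns are removed, the pruned matrices are genuinely smaller and faster to multiply, unlike unstructured (element-wise) sparsity, which leaves the dense shapes intact.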
1 code implementation • 27 Jul 2022 • Donglin Xie, Ruonan Yu, Gongfan Fang, Jie Song, Zunlei Feng, Xinchao Wang, Li Sun, Mingli Song
The goal of FedSA is to train a student model for a new task with the help of several decentralized teachers, whose pre-training tasks and data are different and agnostic.
no code implementations • 16 May 2022 • Xinyin Ma, Xinchao Wang, Gongfan Fang, Yongliang Shen, Weiming Lu
Data-free knowledge distillation (DFKD) conducts knowledge distillation by eliminating the dependence on the original training data, and has recently achieved impressive results in accelerating pre-trained language models.
1 code implementation • 7 Mar 2022 • Haofei Zhang, Feng Mao, Mengqi Xue, Gongfan Fang, Zunlei Feng, Jie Song, Mingli Song
Moreover, the transformer-based students excel at learning amalgamated knowledge: they rapidly master heterogeneous detection tasks and achieve performance superior or at least comparable to that of the teachers in their specializations.
2 code implementations • 12 Dec 2021 • Gongfan Fang, Kanya Mo, Xinchao Wang, Jie Song, Shitao Bei, Haofei Zhang, Mingli Song
At the heart of our approach is a novel strategy to reuse the shared common features in training data so as to synthesize different data instances.
2 code implementations • NeurIPS 2021 • Gongfan Fang, Yifan Bao, Jie Song, Xinchao Wang, Donglin Xie, Chengchao Shen, Mingli Song
Knowledge distillation (KD) aims to craft a compact student model that imitates the behavior of a pre-trained teacher in a target domain.
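As a minimal sketch of the imitation objective underlying KD, the function below computes the standard temperature-softened KL divergence between teacher and student logits (Hinton et al.'s formulation); the temperature value and NumPy implementation are illustrative assumptions, not the specific loss used in this paper.

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable softmax with temperature T."""
    z = z / T
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, T=4.0):
    """KL(teacher || student) on temperature-softened distributions.

    The T^2 factor keeps gradient magnitudes comparable across
    temperatures, as in the classic distillation formulation.
    """
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return T * T * np.sum(p * (np.log(p) - np.log(q)))

teacher = np.array([3.0, 1.0, 0.2])
student = np.array([2.5, 1.4, 0.1])
print(kd_loss(student, teacher))          # small positive penalty
print(kd_loss(teacher, teacher))          # 0.0 -- perfect imitation
```

A higher temperature softens both distributions, exposing the teacher's relative preferences among non-target classes ("dark knowledge") that a hard one-hot label would discard.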
3 code implementations • 18 May 2021 • Gongfan Fang, Jie Song, Xinchao Wang, Chengchao Shen, Xingen Wang, Mingli Song
In this paper, we propose Contrastive Model Inversion (CMI), where the data diversity is explicitly modeled as an optimizable objective, to alleviate the mode collapse issue.
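To show what "diversity as an optimizable objective" can look like, the sketch below scores a batch of synthesized samples' features by their mean pairwise cosine similarity; minimizing it pushes samples apart and so counteracts mode collapse. The feature space and the exact contrastive form used in CMI differ; this is only the general idea.

```python
import numpy as np

def diversity_loss(feats):
    """Mean pairwise cosine similarity of a batch of feature vectors.

    High values mean the synthesized samples have collapsed onto the
    same mode; gradient descent on this term spreads them apart.
    """
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = f @ f.T                      # pairwise cosine similarities
    n = len(f)
    return sim[~np.eye(n, dtype=bool)].mean()  # exclude self-similarity

collapsed = np.ones((4, 8))            # every sample identical
spread = np.eye(4, 8)                  # mutually orthogonal samples
print(diversity_loss(collapsed))       # ~1.0: total mode collapse
print(diversity_loss(spread))          # 0.0: fully diverse batch
```

In practice such a term is added to the usual inversion losses (e.g. matching the teacher's batch-norm statistics and class predictions), so the optimizer must produce samples that are both realistic to the teacher and distinct from one another.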
no code implementations • EMNLP 2020 • Xinyin Ma, Yongliang Shen, Gongfan Fang, Chen Chen, Chenghao Jia, Weiming Lu
To the best of our knowledge, our framework is the first data-free distillation framework designed for NLP tasks.
no code implementations • 10 Jul 2020 • Gongfan Fang, Xinchao Wang, Haofei Zhang, Jie Song, Mingli Song
This network is referred to as the Template Network because its filters will be used as templates to reconstruct images from the impression.
3 code implementations • 23 Dec 2019 • Gongfan Fang, Jie Song, Chengchao Shen, Xinchao Wang, Da Chen, Mingli Song
Knowledge Distillation (KD) has made remarkable progress in the last few years and become a popular paradigm for model compression and knowledge transfer.
2 code implementations • 24 Jun 2019 • Sihui Luo, Xinchao Wang, Gongfan Fang, Yao Hu, Dapeng Tao, Mingli Song
An increasing number of well-trained deep networks have been released online by researchers and developers, enabling the community to reuse them in a plug-and-play way without accessing the training annotations.