1 code implementation • 10 Jul 2023 • Shiya Luo, Defang Chen, Can Wang
Existing works generally synthesize data from the pre-trained teacher model to replace the original training data when training the student.
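The synthesis step can be pictured as a small model-inversion loop: starting from random noise, inputs are optimized until the frozen teacher classifies them confidently, and those inputs then stand in for the inaccessible training data. Below is a minimal PyTorch sketch under that assumption; the confidence objective, input shape, and hyperparameters are illustrative choices, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def synthesize_batch(teacher, batch_size=64, shape=(3, 32, 32),
                     num_classes=10, steps=200, lr=0.1, device="cpu"):
    """Optimize random noise so the frozen teacher assigns it to
    sampled target classes with high confidence (one common form of
    data-free synthesis; hyperparameters here are assumptions)."""
    teacher.eval()
    x = torch.randn(batch_size, *shape, device=device, requires_grad=True)
    y = torch.randint(num_classes, (batch_size,), device=device)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Push the teacher's prediction on x toward the sampled labels y.
        loss = F.cross_entropy(teacher(x), y)
        loss.backward()
        opt.step()
    return x.detach(), y  # synthetic inputs and their pseudo-labels
```

The returned batch can then be fed through the usual distillation objective in place of real data.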
no code implementations • 16 Feb 2022 • Shiya Luo, Defang Chen, Can Wang
Knowledge distillation aims to enhance the performance of a lightweight student model by exploiting the knowledge of a cumbersome, pre-trained teacher model.
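For reference, the standard distillation objective (Hinton et al., 2015) blends a KL-divergence term between temperature-softened teacher and student distributions with ordinary hard-label cross-entropy. A minimal PyTorch sketch, with illustrative values for the temperature T and mixing weight alpha:

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Classic knowledge-distillation loss: soft KL term plus hard
    cross-entropy. T and alpha are illustrative hyperparameters."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients match the hard-label term
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```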