NeurIPS 2020 • Guangda Ji, Zhanxing Zhu
In this paper, we theoretically analyze the knowledge distillation of a wide neural network.
Knowledge Distillation • Model Compression • +1
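The listing reports no code implementation. For background, the sketch below shows the standard soft-target distillation loss (Hinton et al., 2015) that the paper's theoretical analysis concerns: a student network is trained against a temperature-softened teacher distribution blended with the hard-label loss. This is a generic illustration, not the authors' method; the temperature `T`, mixing weight `alpha`, and the toy shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Soft-target knowledge distillation loss (Hinton et al., 2015).

    Blends the KL divergence between temperature-softened student and
    teacher distributions with the usual hard-label cross-entropy.
    T and alpha are illustrative defaults, not values from the paper.
    """
    # Soften both distributions with temperature T; the T**2 factor
    # keeps gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T ** 2)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Toy usage: a batch of 8 examples over 10 classes.
student_logits = torch.randn(8, 10, requires_grad=True)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```

A higher temperature spreads the teacher's probability mass over more classes, exposing the "dark knowledge" in its relative logit differences; this soft-label signal is the object the paper studies in the wide-network limit.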