ICLR 2021 • Tri Dao, Govinda M Kamath, Vasilis Syrgkanis, Lester Mackey
A popular approach to model compression is to train an inexpensive student model to mimic the class probabilities of a highly accurate but cumbersome teacher model.
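To make the idea of mimicking class probabilities concrete, here is a minimal sketch, not the authors' implementation. It assumes the student is trained by minimizing the cross-entropy between the teacher's class probabilities and the student's predicted probabilities. The names `softmax`, `distillation_loss`, and the example logits are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert raw logits into class probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_probs, eps=1e-12):
    """Cross-entropy between teacher probabilities and student predictions.

    Assumption: this is one common choice of mimicry objective; the source
    text does not specify the exact loss.
    """
    student_probs = softmax(student_logits)
    return -sum(p * math.log(q + eps) for p, q in zip(teacher_probs, student_probs))


# Hypothetical teacher output over three classes.
teacher_probs = softmax([4.0, 1.0, 0.5])

# A student that reproduces the teacher's logits versus one that reverses them.
matched = distillation_loss([4.0, 1.0, 0.5], teacher_probs)
mismatched = distillation_loss([0.5, 1.0, 4.0], teacher_probs)
```

Under this objective, the student whose probabilities agree with the teacher's incurs the smaller loss, so gradient-based training would push the student's predictions toward the teacher's class probabilities.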