Stingy Teacher: Sparse Logits Suffice to Fail Knowledge Distillation

29 Sep 2021 · Haoyu Ma, Yifan Huang, Tianlong Chen, Hao Tang, Chenyu You, Zhangyang Wang, Xiaohui Xie

Knowledge distillation (KD) aims to transfer the discriminative power of pre-trained teacher models to (more lightweight) student models. However, KD also poses a risk of intellectual property (IP) leakage from teacher models. Even if the teacher model is released as a black box, it can still be cloned through KD by imitating its input-output behavior. To address this unwanted effect of KD, the concept of Nasty Teacher was recently proposed. It is a special network that achieves nearly the same accuracy as a normal one, but significantly degrades the accuracy of student models that try to imitate it. Previous work builds the nasty teacher by retraining a new model and distorting its output distribution away from the normal one via an adversarial loss. With this design, the "nasty" teacher tends to produce sparse and noisy logits. However, it is unclear why the distorted distribution of the logits is catastrophic to the student model. In addition, the retraining process used in Nasty Teacher is undesirable: it not only degrades the performance of the teacher model but also limits its applicability to large datasets. In this paper, we provide a theoretical analysis of why the sparsity of logits is key to Nasty Teacher. We further propose Stingy Teacher, a much simpler yet more effective algorithm that prevents imitation through KD without incurring an accuracy drop or requiring retraining. Stingy Teacher directly manipulates the logits of a standard pre-trained network by maintaining the values for a small subset of classes while zeroing out the rest. Extensive experiments on large-scale datasets and various teacher-student pairs demonstrate that Stingy Teacher is highly effective and more catastrophic to student models than Nasty Teacher. Code and pre-trained models will be released upon acceptance.
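
The logit manipulation described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the retained subset of classes is chosen or how many classes are kept, so the sketch below assumes the k largest logits per sample are kept; the function name `stingy_logits` and the parameter `k` are hypothetical and not taken from the paper.

```python
import numpy as np

def stingy_logits(logits, k=2):
    """Keep the k largest logits in each row and set all others to zero.

    Assumption: the retained subset is the top-k classes per sample.
    """
    logits = np.asarray(logits, dtype=float)
    # Indices of the k largest entries per row (unordered within the top k).
    top_idx = np.argpartition(logits, -k, axis=-1)[..., -k:]
    sparse = np.zeros_like(logits)
    # Copy the retained logit values into the otherwise all-zero output.
    kept = np.take_along_axis(logits, top_idx, axis=-1)
    np.put_along_axis(sparse, top_idx, kept, axis=-1)
    return sparse

# Toy logits for one sample over five classes.
teacher_logits = np.array([[4.0, 2.5, 1.0, -0.5, -2.0]])
sparse = stingy_logits(teacher_logits, k=2)
print(sparse)
```

In this toy input the predicted class is unchanged because the retained logits are positive. With literal zeroing of raw logits, a sample whose largest logit is negative would have its prediction altered, so a faithful implementation may handle that case differently; the sketch does not attempt to.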
