1 code implementation • 4 May 2022 • Han-Jia Ye, Su Lu, De-Chuan Zhan
Instead of enforcing the teacher to work on the same task as the student, we borrow knowledge from a teacher trained on a general label space -- in this "Generalized Knowledge Distillation (GKD)" setting, the classes of the teacher and the student may be identical, completely different, or partially overlapping.
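A minimal sketch of one way distillation across mismatched label spaces could be set up (this is an illustration under assumptions, not the paper's exact method; the names `overlap_idx_s` and `overlap_idx_t` are hypothetical): the soft-target loss is applied only on the classes shared by teacher and student.

```python
import torch
import torch.nn.functional as F

def partial_overlap_kd_loss(student_logits, teacher_logits,
                            overlap_idx_s, overlap_idx_t, T=4.0):
    """Hypothetical KD loss restricted to shared classes (illustrative only).

    student_logits: (B, C_s) logits over the student's label space
    teacher_logits: (B, C_t) logits over the teacher's label space
    overlap_idx_s / overlap_idx_t: index tensors of the shared classes in
        each label space, aligned so that position k refers to the same class.
    """
    s = student_logits[:, overlap_idx_s] / T          # (B, K) student logits on shared classes
    t = teacher_logits[:, overlap_idx_t] / T          # (B, K) teacher logits on shared classes
    # Match the student's distribution to the teacher's soft targets
    # over the overlapping classes only.
    return F.kl_div(F.log_softmax(s, dim=1),
                    F.softmax(t, dim=1),
                    reduction="batchmean") * (T * T)
```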
no code implementations • 25 Apr 2022 • Su Lu, Han-Jia Ye, De-Chuan Zhan
Our method reuses cross-task knowledge from a distinct label space and efficiently assesses teachers without enumerating the model repository.
no code implementations • 8 Apr 2021 • Su Lu, Han-Jia Ye, De-Chuan Zhan
Specifically, given two videos, we sample segments from each and cast the computation of their distance as an optimal transport problem between the two segment sequences.
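A minimal sketch of how such a segment-to-segment distance could be computed with entropic optimal transport (Sinkhorn iterations in NumPy; the uniform segment weights and cosine cost are assumptions for illustration, not details taken from the paper):

```python
import numpy as np

def sinkhorn_ot_distance(X, Y, eps=0.1, n_iters=200):
    """Entropic OT distance between two segment-feature sequences.

    X: (m, d) features of m segments sampled from video 1
    Y: (n, d) features of n segments sampled from video 2
    """
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    C = 1.0 - Xn @ Yn.T                        # (m, n) cosine cost matrix
    a = np.full(X.shape[0], 1.0 / X.shape[0])  # uniform mass on video-1 segments
    b = np.full(Y.shape[0], 1.0 / Y.shape[0])  # uniform mass on video-2 segments
    K = np.exp(-C / eps)                       # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):                   # Sinkhorn fixed-point updates
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]            # transport plan between segments
    return float((P * C).sum())                # OT cost used as the video distance
```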
1 code implementation • NeurIPS 2021 • Su Lu, Han-Jia Ye, Le Gan, De-Chuan Zhan
Different from the $\mathcal{S}$/$\mathcal{Q}$ protocol, we can also evaluate a task-specific solver by comparing it to a target model $\mathcal{T}$, which is the optimal model for this task or a model that behaves well enough on it (the $\mathcal{S}$/$\mathcal{T}$ protocol).
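A minimal sketch contrasting the two protocols (the agreement measure and function names are assumptions for illustration): under $\mathcal{S}$/$\mathcal{Q}$, a solver fitted on the support set is scored by its accuracy on the query set; under $\mathcal{S}$/$\mathcal{T}$, it is scored by how closely its predictions match those of the target model.

```python
import torch

def evaluate_sq(solver, query_x, query_y):
    """S/Q protocol: accuracy of the task-specific solver on query data."""
    with torch.no_grad():
        pred = solver(query_x).argmax(dim=1)
    return (pred == query_y).float().mean().item()

def evaluate_st(solver, target_model, probe_x):
    """S/T protocol (illustrative): agreement between the solver and a
    target model T assumed to behave well on this task."""
    with torch.no_grad():
        pred_s = solver(probe_x).argmax(dim=1)
        pred_t = target_model(probe_x).argmax(dim=1)
    return (pred_s == pred_t).float().mean().item()
```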