no code implementations • 27 Jun 2023 • Haitao Tang, Yu Fu, Lei Sun, Jiabin Xue, Dan Liu, Yongchao Li, Zhiqiang Ma, Minghui Wu, Jia Pan, Genshun Wan, Ming'en Zhao
In this paper, we propose an adaptive two-stage knowledge distillation method consisting of hidden layer learning and output layer learning.
no code implementations • 17 Apr 2019 • Jiabin Xue, Jiqing Han, Tieran Zheng, Xiang Gao, Jiaxing Guo
On the one hand, we constrain the new parameters not to deviate too far from the original parameters and penalize the new system when it forgets the original knowledge.
Automatic Speech Recognition (ASR) +2
no code implementations • 17 Apr 2019 • Jiabin Xue, Jiqing Han, Tieran Zheng, Jiaxing Guo, Boyong Wu
Thus, training samples with a large propagation error influence the parameters more than samples with a small one.
Automatic Speech Recognition (ASR) +2