Learning from Students: Online Contrastive Distillation Network for General Continual Learning

Conference 2022  ·  Jin Li, Zhong Ji, Gang Wang, Qiang Wang, Feng Gao

The goal of General Continual Learning (GCL) is to preserve learned knowledge and acquire new knowledge with constant memory from an infinite data stream in which task boundaries are blurry. Distilling the model's responses to reserved samples between the old and new models is an effective way to achieve promising performance on GCL. However, this approach accumulates the old model's inherent response bias and is not robust to model changes. To this end, we propose an Online Contrastive Distillation Network (OCD-Net) to tackle these problems, which exploits the merit of the student model at each time step to guide the training process of the teacher model. Concretely, the teacher model is devised to help the student model consolidate learned knowledge; it is trained online by integrating the parameters of the student model so that it also accumulates new knowledge. Moreover, our OCD-Net incorporates both relation distillation and adaptive response distillation to help the student model alleviate catastrophic forgetting, which also benefits the teacher model in preserving learned knowledge. Extensive experiments on six benchmark datasets demonstrate that our OCD-Net significantly outperforms state-of-the-art approaches by 3.03%–8.71% under various buffer sizes.
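The abstract describes an online teacher that is updated by integrating the student's parameters and then used to distill responses on buffered samples. Below is a minimal sketch of what such a loop could look like, assuming an EMA-style parameter integration and a KL-based response distillation term; the function names, the `momentum`, `tau`, and `distill_weight` hyperparameters, and the buffer handling are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F


def ema_update(teacher, student, momentum=0.999):
    """Integrate the student's parameters into the teacher (online teacher update).

    Assumed EMA rule: theta_teacher <- m * theta_teacher + (1 - m) * theta_student.
    """
    with torch.no_grad():
        for t_param, s_param in zip(teacher.parameters(), student.parameters()):
            t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)


def training_step(student, teacher, x_new, y_new, x_buf, distill_weight=1.0, tau=2.0):
    """One step: cross-entropy on the incoming stream plus response distillation on buffered samples."""
    # Supervised loss on the current (possibly task-boundary-free) stream batch.
    ce_loss = F.cross_entropy(student(x_new), y_new)

    # Distill the online teacher's responses on memory-buffer samples.
    with torch.no_grad():
        teacher_logits = teacher(x_buf)
    kd_loss = F.kl_div(
        F.log_softmax(student(x_buf) / tau, dim=1),
        F.softmax(teacher_logits / tau, dim=1),
        reduction="batchmean",
    ) * tau * tau

    return ce_loss + distill_weight * kd_loss
```

In this sketch the teacher replaces a frozen snapshot of the old model, which is the point the abstract emphasizes: because the teacher keeps absorbing the student's parameters, its responses track model changes instead of accumulating a fixed old model's bias. The relation and adaptive response terms mentioned in the abstract are not reproduced here.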
