Pretrained Language Model in Continual Learning: A Comparative Study

Continual learning (CL) is a real-world learning paradigm in which a model learns from a stream of incoming data without forgetting previously learned knowledge. Pre-trained language models (PLMs) have been successfully employed in continual learning for a range of natural language problems. With the rapid development of both CL methods and PLMs, understanding and disentangling their interactions becomes essential for further improving CL performance. In this paper, we thoroughly compare continual learning performance across combinations of 5 PLMs and 4 families of CL methods on 3 benchmarks under 2 typical incremental settings. Guided by a probing analysis that dissects each PLM's performance characteristics layer-wise and task-wise, we propose a simple yet effective method, ICLL (Introspective Continual Language Learning), which updates the inner connections of the pre-trained model to adapt it to continual learning. Our experiments on three incremental sequence classification benchmarks show that the proposed method generalizes across different pre-trained language models.

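The abstract refers to a layer-wise, task-wise probing analysis of PLM representations. Below is a minimal sketch of how layer-wise probing of a frozen encoder is commonly set up, assuming a BERT-style model from Hugging Face Transformers and scikit-learn; the model name, toy texts, and labels are illustrative placeholders and this is not the paper's ICLL method or its benchmark setup.

```python
# Minimal layer-wise probing sketch (illustrative; not the paper's ICLL method).
# A frozen PLM encodes sentences; a linear probe is fit on each layer's [CLS]
# representation to see how task information is distributed across layers.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

model_name = "bert-base-uncased"          # any BERT-style encoder works here
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoder = AutoModel.from_pretrained(model_name)
encoder.eval()

# Toy data standing in for a single task of an incremental benchmark.
texts = ["great movie", "terrible plot", "loved it", "waste of time"]
labels = [1, 0, 1, 0]

with torch.no_grad():
    batch = tokenizer(texts, padding=True, return_tensors="pt")
    out = encoder(**batch, output_hidden_states=True)

# out.hidden_states: tuple of (num_layers + 1) tensors, shape [batch, seq, hidden].
for layer_idx, layer_states in enumerate(out.hidden_states):
    cls_features = layer_states[:, 0, :].numpy()       # [CLS] vector per sentence
    probe = LogisticRegression(max_iter=1000).fit(cls_features, labels)
    acc = probe.score(cls_features, labels)             # fit accuracy on toy data
    print(f"layer {layer_idx:2d}: probe accuracy = {acc:.2f}")
```

In a continual-learning study, one would typically repeat this probing after training on each task in the sequence, tracking how per-layer probe accuracy on earlier tasks degrades as a proxy for forgetting.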