Pretrained Language Model in Continual Learning: A Comparative Study

Continual learning (CL) is a real-world learning paradigm in which a model learns from a stream of incoming data without forgetting previously learned knowledge. Pre-trained language models (PLMs) have been successfully employed in continual learning for a range of natural language problems. With the rapid development of both CL methods and PLMs, understanding and disentangling their interactions becomes essential for further improving CL performance. In this paper, we thoroughly compare continual learning performance across combinations of 5 PLMs and 4 families of CL methods on 3 benchmarks under 2 typical incremental settings. Guided by a probing analysis that dissects each PLM's performance characteristics layer-wise and task-wise, we propose a simple yet effective method, ICLL (Introspective Continual Language Learning), which updates the inner connections of the pre-trained model to adapt it to continual learning. Our experiments on three incremental sequence classification benchmarks show that the proposed method generalizes across different pre-trained language models.

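The abstract refers to a layer-wise, task-wise probing analysis of PLM representations. Below is a minimal sketch of how layer-wise probing of a frozen encoder is commonly set up, assuming a BERT-style model from Hugging Face Transformers and scikit-learn; the model name, toy texts, and labels are illustrative placeholders and this is not the paper's ICLL method or its benchmark setup.

```python
# Minimal layer-wise probing sketch (illustrative; not the paper's ICLL method).
# A frozen PLM encodes sentences; a linear probe is fit on each layer's [CLS]
# representation to see how task information is distributed across layers.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

model_name = "bert-base-uncased"          # any BERT-style encoder works here
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoder = AutoModel.from_pretrained(model_name)
encoder.eval()

# Toy data standing in for a single task of an incremental benchmark.
texts = ["great movie", "terrible plot", "loved it", "waste of time"]
labels = [1, 0, 1, 0]

with torch.no_grad():
    batch = tokenizer(texts, padding=True, return_tensors="pt")
    out = encoder(**batch, output_hidden_states=True)

# out.hidden_states: tuple of (num_layers + 1) tensors, shape [batch, seq, hidden].
for layer_idx, layer_states in enumerate(out.hidden_states):
    cls_features = layer_states[:, 0, :].numpy()       # [CLS] vector per sentence
    probe = LogisticRegression(max_iter=1000).fit(cls_features, labels)
    acc = probe.score(cls_features, labels)             # fit accuracy on toy data
    print(f"layer {layer_idx:2d}: probe accuracy = {acc:.2f}")
```

In a continual-learning study, one would typically repeat this probing after training on each task in the sequence, tracking how per-layer probe accuracy on earlier tasks degrades as a proxy for forgetting.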