Continual Pretraining

22 papers with code • 3 benchmarks • 3 datasets

Continual pretraining (also written continual pre-training) continues the self-supervised pretraining of an already pretrained model on new data, such as a domain-specific corpus, a stream of incoming data, or longer input sequences, before fine-tuning it on downstream tasks. The goal is to accumulate new knowledge while mitigating forgetting and avoiding the cost of pretraining from scratch.

Most implemented papers

Fortunately, Discourse Markers Can Enhance Language Models for Sentiment Analysis

ibm/tslm-discourse-markers 6 Jan 2022

In recent years, pretrained language models have revolutionized the NLP world, achieving state-of-the-art performance on various downstream tasks.

Hierarchical Label-wise Attention Transformer Model for Explainable ICD Coding

leiboliu/hilat 22 Apr 2022

In this study, we propose a hierarchical label-wise attention Transformer model (HiLAT) for the explainable prediction of ICD codes from clinical documents.
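
The explainability in HiLAT comes from label-wise attention: each ICD code attends over token representations with its own attention distribution. Below is a minimal, generic label-wise attention layer in PyTorch that illustrates this mechanism; it is only a sketch of the attention component, not the full hierarchical HiLAT architecture, and the tensor sizes are placeholders.

```python
# Generic label-wise attention sketch: each label (ICD code) gets its own
# attention distribution over token representations, which is what makes
# per-label predictions inspectable. Not the full hierarchical HiLAT model.
import torch
import torch.nn as nn

class LabelWiseAttention(nn.Module):
    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        self.label_queries = nn.Linear(hidden_size, num_labels)  # one query per label
        self.classifier = nn.Linear(hidden_size, num_labels)     # per-label scorer

    def forward(self, token_states):                  # (batch, seq_len, hidden)
        scores = self.label_queries(token_states)     # (batch, seq_len, num_labels)
        attn = torch.softmax(scores, dim=1)           # attention over tokens, per label
        label_docs = torch.einsum("bsl,bsh->blh", attn, token_states)
        logits = (self.classifier.weight * label_docs).sum(-1) + self.classifier.bias
        return logits, attn                           # logits: (batch, num_labels)

hidden = torch.randn(2, 50, 768)                      # e.g. transformer encoder outputs
logits, attn = LabelWiseAttention(768, num_labels=20)(hidden)
print(logits.shape, attn.shape)                       # (2, 20) and (2, 50, 20)
```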

Continual Pre-Training Mitigates Forgetting in Language and Vision

andreacossu/continual-pretraining-nlp-vision 19 May 2022

We formalize and investigate the characteristics of the continual pre-training scenario in both language and vision environments, where a model is continually pre-trained on a stream of incoming data and only later fine-tuned on different downstream tasks.
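
The scenario above can be sketched with HuggingFace Transformers and Datasets: a masked-language model is updated sequentially on a stream of corpora and only afterwards fine-tuned. This is an illustrative sketch with placeholder corpus file names, not the authors' code (which is in andreacossu/continual-pretraining-nlp-vision).

```python
# Minimal sketch of the continual pre-training scenario: an MLM is pre-trained
# on a stream of incoming corpora; fine-tuning happens only after each step.
# Corpus file names are placeholders, not the paper's datasets.
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

# Placeholder stream of incoming corpora (e.g. scientific, news, legal text).
stream = ["corpus_step_1.txt", "corpus_step_2.txt", "corpus_step_3.txt"]

for step, path in enumerate(stream):
    ds = load_dataset("text", data_files=path, split="train").map(
        tokenize, batched=True, remove_columns=["text"])
    Trainer(
        model=model,
        args=TrainingArguments(output_dir=f"ckpt_step_{step}",
                               num_train_epochs=1,
                               per_device_train_batch_size=8,
                               report_to="none"),
        train_dataset=ds,
        data_collator=collator,
    ).train()
    # After each step, the updated weights can be fine-tuned on downstream
    # tasks to measure how much is forgotten.
```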

Unsupervised Domain Adaptation for Sparse Retrieval by Filling Vocabulary and Word Frequency Gaps

meshidenn/cai 8 Nov 2022

We conducted experiments using our method on datasets with a large vocabulary gap from a source domain.
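
One ingredient of closing a vocabulary gap can be sketched with the standard Transformers API: domain terms missing from the source vocabulary are added to the tokenizer and the embedding matrix is resized before continual pretraining. This is a generic illustration of the vocabulary-gap idea, not the authors' method (which also addresses word frequency gaps for sparse retrieval); the domain terms are placeholders.

```python
# Illustrative sketch of filling a vocabulary gap before adapting a model to a
# new domain: add domain-specific tokens and resize the embedding matrix so
# continual pretraining on target-domain text can learn them.
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Placeholder list of target-domain terms absent from the source vocabulary.
domain_terms = ["nephropathy", "immunohistochemistry", "bronchoalveolar"]
num_added = tokenizer.add_tokens(domain_terms)
model.resize_token_embeddings(len(tokenizer))
print(f"Added {num_added} domain tokens; new vocab size: {len(tokenizer)}")
# The model would then be continually pretrained (MLM) on target-domain text
# so the new embeddings reflect the target word distribution.
```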

AF Adapter: Continual Pretraining for Building Chinese Biomedical Language Model

yanyongyu/af-adapter 21 Nov 2022

Continual pretraining is a popular way of building a domain-specific pretrained language model from a general-domain language model.
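
Adapter-style continual pretraining keeps the general-domain backbone frozen and trains only small added modules on domain text. The sketch below shows a generic bottleneck adapter in PyTorch; it is not the AF Adapter architecture itself, only an illustration of the adapter family it belongs to, with placeholder dimensions.

```python
# Generic bottleneck-adapter sketch for parameter-efficient continual
# pretraining: the backbone is frozen and only small adapters are updated on
# domain text. NOT the AF Adapter architecture, just an illustration.
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)  # project down
        self.up = nn.Linear(bottleneck, hidden_size)    # project back up
        self.act = nn.GELU()

    def forward(self, x):
        # Residual connection: the adapter adds a small learned correction
        # on top of the frozen backbone's representation.
        return x + self.up(self.act(self.down(x)))

hidden = torch.randn(2, 16, 768)           # (batch, seq_len, hidden) placeholder
adapter = BottleneckAdapter(hidden_size=768)
out = adapter(hidden)                      # same shape as the input
print(out.shape)                           # torch.Size([2, 16, 768])
```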

CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation

kevinlight831/ctp ICCV 2023

Given the growing nature of real-world data, such an offline training paradigm on ever-expanding data is unsustainable, because models lack the continual-learning ability to accumulate knowledge over time.
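
The momentum-contrast part of the title follows the MoCo-style recipe of maintaining a key encoder as an exponential moving average of the query encoder. Below is a minimal sketch of that generic update rule only; CTP's "compatible" variant and topology preservation are not reproduced here, and the encoder is a placeholder.

```python
# Sketch of the momentum-encoder update used in momentum-contrast methods:
# the key encoder is an exponential moving average (EMA) of the query encoder
# and is never updated by backpropagation.
import copy
import torch
import torch.nn as nn

query_encoder = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 128))
key_encoder = copy.deepcopy(query_encoder)
for p in key_encoder.parameters():
    p.requires_grad = False               # key encoder receives no gradients

@torch.no_grad()
def momentum_update(q_enc, k_enc, m: float = 0.999):
    # theta_k <- m * theta_k + (1 - m) * theta_q
    for q_param, k_param in zip(q_enc.parameters(), k_enc.parameters()):
        k_param.mul_(m).add_(q_param, alpha=1.0 - m)

momentum_update(query_encoder, key_encoder)
```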

Effective Long-Context Scaling of Foundation Models

openlmlab/leval 27 Sep 2023

We also examine the impact of various design choices in the pretraining process, including the data mix and the training curriculum of sequence lengths. Our ablation experiments suggest that having abundant long texts in the pretraining dataset is not the key to achieving strong performance, and we empirically verify that long-context continual pretraining is more efficient and similarly effective compared to pretraining from scratch with long sequences.
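
The core recipe is to resume training an existing short-context checkpoint on long documents rather than pretraining from scratch with long sequences. The sketch below shows the document-packing step with a small placeholder model and block size; position-embedding scaling (e.g. RoPE scaling) is model-family and library-version dependent and is not shown.

```python
# Sketch of long-context continual pretraining: an existing causal LM keeps
# training on long documents packed into a larger context window, starting
# from the short-context checkpoint instead of random weights.
# Model name, example texts, and block size are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"                       # placeholder base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

long_ctx = 1024                           # placeholder "long" block size

def pack_long_blocks(texts):
    """Concatenate documents and split into fixed-length long blocks."""
    ids = sum(tokenizer(texts)["input_ids"], [])
    # With real long documents this yields many blocks of length long_ctx.
    return [ids[i:i + long_ctx] for i in range(0, len(ids) - long_ctx + 1, long_ctx)]

blocks = pack_long_blocks(["first long document ...", "second long document ..."])
# Continual pretraining then runs a standard causal-LM loss over these blocks.
```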

PECoP: Parameter Efficient Continual Pretraining for Action Quality Assessment

plrbear/pecop 11 Nov 2023

The limited availability of labelled data in Action Quality Assessment (AQA) has forced previous works to fine-tune their models pretrained on large-scale domain-general datasets.

Continual Learning for Large Language Models: A Survey

wang-ml-lab/llm-continual-learning-survey 2 Feb 2024

Large language models (LLMs) are not amenable to frequent re-training, due to high training costs arising from their massive scale.