Universal Language Model Fine-tuning

Introduced by Howard and Ruder in Universal Language Model Fine-tuning for Text Classification

Universal Language Model Fine-tuning, or ULMFiT, is an architecture and transfer learning method that can be applied to NLP tasks. It uses a 3-layer AWD-LSTM architecture for its representations. Training consists of three steps: 1) general-domain language model pre-training on Wikipedia-based text, 2) fine-tuning the language model on target task data, and 3) fine-tuning the classifier on the target task.

Because different layers capture different types of information, they are fine-tuned to different extents using discriminative fine-tuning, which assigns each layer its own learning rate. Training uses slanted triangular learning rates (STLR), a learning rate schedule that first linearly increases the learning rate and then linearly decays it.
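The two techniques above can be sketched in a few lines of framework-free Python. This is a minimal illustration, not a reference implementation: the function names are hypothetical, and the default values (a per-layer decay factor of 2.6, a warm-up fraction of 0.1, a ratio of 32, and a peak learning rate of 0.01) are the settings reported in the ULMFiT paper rather than anything stated on this page, so treat them as assumptions.

```python
import math


def stlr(t, total_steps, lr_max=0.01, cut_frac=0.1, ratio=32):
    """Slanted triangular learning rate at training step t (0-indexed).

    cut_frac: fraction of steps spent increasing the rate (assumed default).
    ratio: how much smaller the lowest rate is than lr_max (assumed default).
    """
    # Step at which the schedule switches from increasing to decaying.
    # max(1, ...) avoids a division by zero for very short runs.
    cut = max(1, math.floor(total_steps * cut_frac))
    if t < cut:
        p = t / cut  # short linear warm-up
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))  # long linear decay
    return lr_max * (1 + p * (ratio - 1)) / ratio


def discriminative_lrs(lr_last, num_layers, decay=2.6):
    """Per-layer learning rates, ordered from the first layer to the last.

    The last layer gets lr_last; each earlier layer gets the rate of the
    layer above it divided by `decay` (assumed default).
    """
    return [lr_last / decay ** (num_layers - 1 - i) for i in range(num_layers)]


total = 1000
peak_step = math.floor(total * 0.1)
print("lr at step 0:   ", stlr(0, total))
print("lr at peak step:", stlr(peak_step, total))
print("lr at last step:", stlr(total, total))
print("per-layer rates:", discriminative_lrs(0.01, num_layers=3))
```

In this sketch the schedule reaches its peak after the first 10% of steps and spends the remaining 90% decaying, which is what gives the triangle its slanted shape. Earlier layers receive smaller rates so that their more general features change less during fine-tuning.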

Fine-tuning the target classifier is achieved in ULMFiT using gradual unfreezing. Rather than fine-tuning all layers at once, which risks catastrophic forgetting, ULMFiT gradually unfreezes the model starting from the last layer (i.e., the one closest to the output), as this contains the least general knowledge. First, the last layer is unfrozen and all unfrozen layers are fine-tuned for one epoch. Then the next group of frozen layers is unfrozen and the process repeats, until all layers are unfrozen and fine-tuned to convergence in the last iteration.
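The unfreezing order described above can be illustrated with a small, framework-free Python sketch. The function and the layer-group names are hypothetical and chosen only for illustration; a real training loop would mark the listed groups as trainable and run one epoch of fine-tuning at each stage.

```python
def gradual_unfreezing_schedule(layer_groups):
    """Yield, stage by stage, the layer groups that are trainable.

    layer_groups is ordered from the input side to the output side.
    Stage 1 trains only the last group; each later stage unfreezes one
    more group, until every group is trainable.
    """
    for k in range(1, len(layer_groups) + 1):
        yield layer_groups[-k:]


# Hypothetical grouping of a 3-layer LSTM classifier.
groups = ["embedding", "lstm_1", "lstm_2", "lstm_3", "classifier_head"]
schedule = list(gradual_unfreezing_schedule(groups))

for stage, trainable in enumerate(schedule, start=1):
    print(f"stage {stage}: fine-tune {trainable}")
```

Only in the final stage are all groups trainable, which is the point at which the description above has the model fine-tuned until convergence.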

Source: Universal Language Model Fine-tuning for Text Classification

Latest Papers

Experimental Evaluation of Deep Learning models for Marathi Text Classification
Atharva Kulkarni, Meet Mandhane, Manali Likhitkar, Gayatri Kshirsagar, Jayashree Jagdale, Raviraj Joshi
LaDiff ULMFiT: A Layer Differentiated training approach for ULMFiT
Mohammed Azhan, Mohammad Ahmad
Palomino-Ochoa at SemEval-2020 Task 9: Robust System based on Transformer for Code-Mixed Sentiment Classification
Daniel Palomino, Jose Ochoa-Luna
Pagsusuri ng RNN-based Transfer Learning Technique sa Low-Resource Language
Dan John Velasco
[email protected]: Pre-training ULMFiT on Synthetically Generated Code-Mixed Data for Hate Speech Detection
Gaurav Arora
FarsTail: A Persian Natural Language Inference Dataset
Hossein Amirkhani, Mohammad Azari Jafari, Azadeh Amirak, Zohreh Pourjafari, Soroush Faridan Jahromi, Zeinab Kouhkan
Fine-tuning Pre-trained Contextual Embeddings for Citation Content Analysis in Scholarly Publication
Haihua Chen, Huyen Nguyen
HinglishNLP: Fine-tuned Language Models for Hinglish Sentiment Detection
Meghana Bhange, Nirant Kasliwal
Text Categorization for Conflict Event Annotation
Fredrik Olsson, Magnus Sahlgren, Fehmi ben Abdesslem, Ariel Ekgren, Kristine Eck
Offensive language detection in Arabic using ULMFiT
Mohamed Abdellatif, Ahmed Elgammal
Evaluation Metrics for Headline Generation Using Deep Pre-Trained Embeddings
Abdul Moeed, Yang An, Gerhard Hagerer, Georg Groh
Inferring the source of official texts: can SVM beat ULMFiT?
Pedro Henrique Luz de Araujo, Teófilo Emidio de Campos, Marcelo Magalhães Silva de Sousa
Localized Flood Detection With Minimal Labeled Social Media Data Using Transfer Learning
Neha Singh, Nirmalya Roy, Aryya Gangopadhyay
Natural language processing of MIMIC-III clinical notes for identifying diagnosis and procedures with neural networks
Siddhartha Nuthakki, Sunil Neela, Judy W. Gichoya, Saptarshi Purkayastha
A Comparative Study of Pretrained Language Models on Thai Social Text Categorization
Thanapapas Horsuwan, Kasidis Kanwatchara, Peerapon Vateekul, Boonserm Kijsirikul
Evolution of transfer learning in natural language processing
Aditya Malte, Pratik Ratadiya
The merits of Universal Language Model Fine-tuning for Small Datasets -- a case with Dutch book reviews
Benjamin van der Burgh, Suzan Verberne
Analyzing Customer Feedback for Product Fit Prediction
Stephan Baier
Low-Shot Classification: A Comparison of Classical and Deep Transfer Machine Learning Approaches
Peter Usherwood, Steven Smit
Evaluating Language Model Finetuning Techniques for Low-resource Languages
Jan Christian Blaise Cruz, Charibeth Cheng
Exploiting Unsupervised Pre-training and Automated Feature Engineering for Low-resource Hate Speech Detection in Polish
Renard Korzeniowski, Rafał Rolczyński, Przemysław Sadownik, Tomasz Korbak, Marcin Możejko
Speak up, Fight Back! Detection of Social Media Disclosures of Sexual Harassment
Arijit Ghosh Chowdhury, Ramit Sawhney, Puneet Mathur, Debanjan Mahata, Rajiv Ratn Shah
Figure Eight at SemEval-2019 Task 3: Ensemble of Transfer Learning Methods for Contextual Emotion Detection
Joan Xiao
An Empirical Evaluation of Text Representation Schemes on Multilingual Social Web to Filter the Textual Aggression
Sandip Modha, Prasenjit Majumder
Low Resource Text Classification with ULMFit and Backtranslation
Sam Shleifer
Universal Language Model Fine-tuning for Text Classification
Jeremy Howard, Sebastian Ruder