ASGD Weight-Dropped LSTM

Introduced by Merity et al. in Regularizing and Optimizing LSTM Language Models

ASGD Weight-Dropped LSTM, or AWD-LSTM, is a recurrent neural network that applies DropConnect to its hidden-to-hidden weight matrices for regularization and is optimized with NT-ASGD (non-monotonically triggered averaged SGD), which returns an average of the weights over the last iterations of training rather than the final iterate alone. Additional regularization techniques include variable-length backpropagation sequences, variational dropout, embedding dropout, weight tying, independent embedding and hidden sizes, activation regularization, and temporal activation regularization.
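As a rough illustration of two of these ideas, the sketch below shows DropConnect applied to a weight matrix and the non-monotonic trigger that decides when weight averaging starts. It is a minimal pure-Python sketch, not the authors' implementation: the function names (`drop_connect`, `should_trigger_averaging`, `average_iterates`), the inverted-dropout rescaling by 1/(1 - p), and the toy validation losses are all assumptions made for illustration.

```python
import random


def drop_connect(weights, p, rng):
    """Zero each weight independently with probability p (DropConnect).

    Assumption: surviving weights are rescaled by 1 / (1 - p), following the
    usual inverted-dropout convention. Requires 0 <= p < 1.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    keep = 1.0 - p
    return [[(w / keep if rng.random() >= p else 0.0) for w in row]
            for row in weights]


def should_trigger_averaging(val_history, current_val, n):
    """Non-monotonic trigger: return True when the current validation metric
    (lower is better) is worse than the best value recorded more than n
    checks ago, i.e. no improvement over the last n checks."""
    t = len(val_history)
    return t > n and current_val > min(val_history[: t - n])


def average_iterates(iterates):
    """Element-wise mean of a list of equally sized weight vectors."""
    count = len(iterates)
    return [sum(column) / count for column in zip(*iterates)]


if __name__ == "__main__":
    rng = random.Random(0)
    hidden_to_hidden = [[0.5, -0.2], [0.1, 0.3]]  # toy weight matrix
    print(drop_connect(hidden_to_hidden, p=0.5, rng=rng))

    # Toy validation losses: improvement stalls after the third check.
    history = []
    for step, val in enumerate([5.0, 4.0, 3.5, 3.6, 3.7, 3.8]):
        if should_trigger_averaging(history, val, n=2):
            print("start averaging at check", step)
            break
        history.append(val)

    print(average_iterates([[1.0, 2.0], [3.0, 4.0]]))
```

In a real training loop, the mask would be sampled once per forward and backward pass on the recurrent weights, and the iterates collected after the trigger fires would be averaged to produce the final model.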

Source: Regularizing and Optimizing LSTM Language Models

Latest Papers

PAPER / AUTHORS / DATE
Pagsusuri ng RNN-based Transfer Learning Technique sa Low-Resource Language
Dan John Velasco
2020-10-13
[email protected]: Pre-training ULMFiT on Synthetically Generated Code-Mixed Data for Hate Speech Detection
Gaurav Arora
2020-10-05
Fine-tuning Pre-trained Contextual Embeddings for Citation Content Analysis in Scholarly Publication
Haihua Chen, Huyen Nguyen
2020-09-12
HinglishNLP: Fine-tuned Language Models for Hinglish Sentiment Detection
Meghana Bhange, Nirant Kasliwal
2020-08-22
Composer Style Classification of Piano Sheet Music Images Using Language Model Pretraining
TJ Tsai, Kevin Ji
2020-07-29
Probing for Referential Information in Language Models
Ionut-Teodor Sorodoc, Kristina Gulordava, Gemma Boleda
2020-07-01
Text Categorization for Conflict Event Annotation
Fredrik Olsson, Magnus Sahlgren, Fehmi ben Abdesslem, Ariel Ekgren, Kristine Eck
2020-05-01
Offensive language detection in Arabic using ULMFiT
Mohamed Abdellatif, Ahmed Elgammal
2020-05-01
Evaluation Metrics for Headline Generation Using Deep Pre-Trained Embeddings
Abdul Moeed, Yang An, Gerhard Hagerer, Georg Groh
2020-05-01
Inferring the source of official texts: can SVM beat ULMFiT?
Pedro Henrique Luz de Araujo, Teófilo Emidio de Campos, Marcelo Magalhães Silva de Sousa
2020-03-02
MaxUp: A Simple Way to Improve Generalization of Neural Network Training
Chengyue Gong, Tongzheng Ren, Mao Ye, Qiang Liu
2020-02-20
Localized Flood Detection With Minimal Labeled Social Media Data Using Transfer Learning
Neha Singh, Nirmalya Roy, Aryya Gangopadhyay
2020-02-10
Natural language processing of MIMIC-III clinical notes for identifying diagnosis and procedures with neural networks
Siddhartha Nuthakki, Sunil Neela, Judy W. Gichoya, Saptarshi Purkayastha
2019-12-28
A Comparative Study of Pretrained Language Models on Thai Social Text Categorization
Thanapapas Horsuwan, Kasidis Kanwatchara, Peerapon Vateekul, Boonserm Kijsirikul
2019-12-03
DeFINE: DEep Factorized INput Token Embeddings for Neural Sequence Modeling
Sachin Mehta, Rik Koncel-Kedziorski, Mohammad Rastegari, Hannaneh Hajishirzi
2019-11-27
A Subword Level Language Model for Bangla Language
Aisha Khatun, Anisur Rahman, Hemayet Ahmed Chowdhury, Md. Saiful Islam, Ayesha Tasnim
2019-11-15
Evolution of transfer learning in natural language processing
Aditya Malte, Pratik Ratadiya
2019-10-16
The merits of Universal Language Model Fine-tuning for Small Datasets -- a case with Dutch book reviews
Benjamin van der Burgh, Suzan Verberne
2019-10-02
Analyzing Customer Feedback for Product Fit Prediction
Stephan Baier
2019-08-28
Low-Shot Classification: A Comparison of Classical and Deep Transfer Machine Learning Approaches
Peter Usherwood, Steven Smit
2019-07-17
Evaluating Language Model Finetuning Techniques for Low-resource Languages
Jan Christian Blaise Cruz, Charibeth Cheng
2019-06-30
Exploiting Unsupervised Pre-training and Automated Feature Engineering for Low-resource Hate Speech Detection in Polish
Renard Korzeniowski, Rafał Rolczyński, Przemysław Sadownik, Tomasz Korbak, Marcin Możejko
2019-06-17
Speak up, Fight Back! Detection of Social Media Disclosures of Sexual Harassment
Arijit Ghosh Chowdhury, Ramit Sawhney, Puneet Mathur, Debanjan Mahata, Rajiv Ratn Shah
2019-06-01
Figure Eight at SemEval-2019 Task 3: Ensemble of Transfer Learning Methods for Contextual Emotion Detection
Joan Xiao
2019-06-01
An Empirical Evaluation of Text Representation Schemes on Multilingual Social Web to Filter the Textual Aggression
Sandip Modha, Prasenjit Majumder
2019-04-16
Low Resource Text Classification with ULMFit and Backtranslation
Sam Shleifer
2019-03-21
Language Informed Modeling of Code-Switched Text
Khyathi Chandu, Thomas Manzini, Sumeet Singh, Alan W. Black
2018-07-01
Universal Language Model Fine-tuning for Text Classification
Jeremy Howard, Sebastian Ruder
2018-01-18
Breaking the Softmax Bottleneck: A High-Rank RNN Language Model
Zhilin Yang, Zihang Dai, Ruslan Salakhutdinov, William W. Cohen
2017-11-10
Regularizing and Optimizing LSTM Language Models
Stephen Merity, Nitish Shirish Keskar, Richard Socher
2017-08-07

Categories