Discriminative Fine-Tuning

Introduced by Howard and Ruder in Universal Language Model Fine-tuning for Text Classification

Discriminative fine-tuning is a fine-tuning strategy introduced for ULMFiT-type models. Instead of using the same learning rate for all layers of the model, it tunes each layer with its own learning rate. For context, the regular stochastic gradient descent (SGD) update of a model’s parameters $\theta$ at time step $t$ looks like the following (Ruder, 2016):

$$ \theta_{t} = \theta_{t-1} - \eta\cdot\nabla_{\theta}J\left(\theta\right) $$

where $\eta$ is the learning rate and $\nabla_{\theta}J\left(\theta\right)$ is the gradient of the model’s objective function $J$ with respect to the parameters. For discriminative fine-tuning, we split the parameters $\theta$ into $\{\theta^{1}, \ldots, \theta^{L}\}$, where $\theta^{l}$ contains the parameters of the model at the $l$-th layer and $L$ is the number of layers of the model. Similarly, we obtain $\{\eta^{1}, \ldots, \eta^{L}\}$, where $\eta^{l}$ is the learning rate of the $l$-th layer. The SGD update with discriminative fine-tuning is then:

$$ \theta_{t}^{l} = \theta_{t-1}^{l} - \eta^{l}\cdot\nabla_{\theta^{l}}J\left(\theta\right) $$
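The per-layer update can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: each layer's parameters are a flat list of floats, the gradients are supplied by the caller, and the function name `discriminative_sgd_step` and the example values are hypothetical rather than taken from the paper.

```python
from typing import List, Sequence


def discriminative_sgd_step(
    params: Sequence[Sequence[float]],
    grads: Sequence[Sequence[float]],
    lrs: Sequence[float],
) -> List[List[float]]:
    """Apply one SGD step in which each layer uses its own learning rate."""
    if not (len(params) == len(grads) == len(lrs)):
        raise ValueError("params, grads and lrs need one entry per layer")
    updated = []
    for layer_params, layer_grads, lr in zip(params, grads, lrs):
        # theta_t^l = theta_{t-1}^l - eta^l * gradient of J w.r.t. theta^l
        updated.append([p - lr * g for p, g in zip(layer_params, layer_grads)])
    return updated


# Hypothetical two-layer model with made-up parameter and gradient values.
params = [[1.0, 2.0], [3.0]]
grads = [[0.5, 0.5], [1.0]]
# The lower layer (index 0) gets a smaller learning rate than the last layer.
new_params = discriminative_sgd_step(params, grads, lrs=[0.001, 0.01])
```

With a single shared learning rate in `lrs`, this reduces to the regular SGD update, so the only change is that the step size is looked up per layer.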

The authors found empirically that it works well to first choose the learning rate $\eta^{L}$ of the last layer by fine-tuning only the last layer, and then to use $\eta^{l-1}=\eta^{l}/2.6$ as the learning rate for the lower layers.
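Unrolling the recurrence $\eta^{l-1}=\eta^{l}/2.6$ gives $\eta^{l}=\eta^{L}/2.6^{L-l}$, so all per-layer rates follow from the last layer's rate. The sketch below computes them; the helper name `layer_learning_rates`, its `factor` parameter, and the example value of 0.01 for the last layer are illustrative assumptions, not values prescribed by the paper.

```python
from typing import List


def layer_learning_rates(
    last_layer_lr: float, num_layers: int, factor: float = 2.6
) -> List[float]:
    """Return one learning rate per layer, ordered from lowest to last layer.

    Each layer's rate is the rate of the layer above it divided by `factor`.
    """
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    # Index i = 0 is the lowest layer; index num_layers - 1 is the last layer,
    # which keeps last_layer_lr unchanged (exponent 0).
    return [last_layer_lr / factor ** (num_layers - 1 - i) for i in range(num_layers)]


# Hypothetical three-layer model whose last layer was tuned to a rate of 0.01.
rates = layer_learning_rates(last_layer_lr=0.01, num_layers=3)
```

In frameworks that support per-group optimizer settings, a list like `rates` would typically be paired with one parameter group per layer; that wiring depends on the framework and is left out here.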

Source: Universal Language Model Fine-tuning for Text Classification (Jeremy Howard, Sebastian Ruder, 2018-01-18)
