Entailment as Few-Shot Learner

29 Apr 2021 · Sinong Wang, Han Fang, Madian Khabsa, Hanzi Mao, Hao Ma

Large pre-trained language models (LMs) have demonstrated remarkable ability as few-shot learners. However, their success hinges largely on scaling model parameters to a degree that makes them challenging to train and serve. In this paper, we propose a new approach, named EFL, that can turn small LMs into better few-shot learners. The key idea is to reformulate a potential NLP task as an entailment task, and then fine-tune the model with as few as 8 examples. We further demonstrate that our proposed method can be: (i) naturally combined with an unsupervised contrastive learning-based data augmentation method; (ii) easily extended to multilingual few-shot learning. A systematic evaluation on 18 standard NLP tasks demonstrates that this approach improves various existing SOTA few-shot learning methods by 12%, and yields few-shot performance competitive with models 500 times larger, such as GPT-3.
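To make the reformulation concrete, below is a minimal sketch (not the authors' code) of the core idea: a classification example is recast as an entailment pair by pairing the input text with a natural-language description of each label, then scoring entailment with an off-the-shelf NLI model. The model name and label templates here are illustrative assumptions; the paper additionally fine-tunes the entailment model on the few labeled examples, which this sketch omits.

```python
# Sketch of the EFL-style reformulation: classification as textual entailment.
# Assumptions: roberta-large-mnli as the entailment backbone, and hand-written
# label templates ("It was great." / "It was terrible.") as hypotheses.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "roberta-large-mnli"  # off-the-shelf NLI model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

def classify(text: str, label_descriptions: dict) -> str:
    """Return the label whose description the input most strongly entails."""
    scores = {}
    for label, hypothesis in label_descriptions.items():
        # Premise = input text, hypothesis = label description.
        inputs = tokenizer(text, hypothesis, return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = model(**inputs).logits  # [contradiction, neutral, entailment]
        scores[label] = logits.softmax(dim=-1)[0, 2].item()  # entailment probability
    return max(scores, key=scores.get)

# Example: binary sentiment recast as entailment (templates are assumptions).
print(classify(
    "A gripping film with superb performances.",
    {"positive": "It was great.", "negative": "It was terrible."},
))
```

Because the entailment head is shared across tasks, the same backbone can serve many downstream tasks by swapping label templates, which is what makes the few-shot fine-tuning in the paper data-efficient.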


Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|------|---------|-------|-------------|--------------|-------------|
| Question Answering | BoolQ | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 86.0 | #14 |
| Linguistic Acceptability | CoLA | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 86.4% | #5 |
| Sentiment Analysis | CR | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 92.5 | #3 |
| Sentiment Analysis | IMDb | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 96.1 | #5 |
| Sentiment Analysis | MPQA | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 90.8 | #1 |
| Sentiment Analysis | MR | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 92.5 | #2 |
| Semantic Textual Similarity | MRPC | RoBERTa-large 355M + Entailment as Few-shot Learner | F1 | 91.0 | #8 |
| Topic Classification | OS | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 95.1 | #1 |
| Natural Language Inference | QNLI | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 94.5% | #15 |
| Paraphrase Identification | Quora Question Pairs | RoBERTa-large 355M + Entailment as Few-shot Learner | F1 | 89.2 | #2 |
| Natural Language Inference | RTE | RoBERTa-large 355M + EFL + UCA | Accuracy | 87.2% | #21 |
| Natural Language Inference | RTE | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 90.5% | #15 |
| Natural Language Inference | SNLI | EFL (Entailment as Few-shot Learner) + RoBERTa-large | % Test Accuracy | 93.1 | #1 |
| | | | % Train Accuracy | ? | #74 |
| | | | Parameters | 355M | #4 |
| Natural Language Inference | SNLI | RoBERTa-large 355M + Entailment as Few-shot Learner | % Test Accuracy | 93.1 | #1 |
| | | | Parameters | 355M | #1 |
| Sentiment Analysis | SST-2 Binary classification | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 96.9 | #8 |
| Semantic Textual Similarity | STS Benchmark | RoBERTa-large 355M + Entailment as Few-shot Learner | Pearson Correlation | 0.918 | #11 |
| Subjectivity Analysis | SUBJ | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 97.1 | #3 |