Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

arXiv 2019 · Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu

Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice...
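
The paper's unifying idea is to cast every NLP problem as feeding the model text and training it to generate text. As a minimal sketch, assuming the Hugging Face Transformers implementation of T5 and the public `t5-small` checkpoint (both external to this page):

```python
# Minimal sketch of T5's text-to-text interface, assuming the Hugging Face
# Transformers library and the public "t5-small" checkpoint. The task (here,
# translation) is selected by a plain-text prefix on the input, and the
# answer is generated as text.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

input_ids = tokenizer(
    "translate English to German: The house is wonderful.",
    return_tensors="pt",
).input_ids

outputs = model.generate(input_ids, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# Prints (approximately): Das Haus ist wunderbar.
```

Because inputs and outputs are always strings, the same model, loss, and decoding procedure can be fine-tuned on any of the benchmarks listed below.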


Results from the Paper


Ranked #1 on Semantic Textual Similarity on STS Benchmark (using extra training data)

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
| --- | --- | --- | --- | --- | --- |
| Linguistic Acceptability | CoLA | T5-11B | Accuracy | 70.8% | #1 |
| Natural Language Inference | MultiNLI | T5-11B | Matched | 92.0 | #1 |
| Sentiment Analysis | SST-2 Binary classification | T5-Small | Accuracy | 91.8 | #16 |
| Sentiment Analysis | SST-2 Binary classification | T5-Base | Accuracy | 95.2 | #8 |
| Sentiment Analysis | SST-2 Binary classification | T5-3B | Accuracy | 97.4 | #1 |
| Sentiment Analysis | SST-2 Binary classification | T5-11B | Accuracy | 97.1 | #2 |
| Semantic Textual Similarity | STS Benchmark | T5-11B | Pearson Correlation | 0.925 | #1 |
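
To make the table concrete, below is a hedged sketch of how these GLUE tasks are serialized under the text-to-text format. The task prefixes and label words follow the paper's appendix; the helper function and example values are hypothetical illustrations:

```python
# Hedged sketch of the text-to-text serialization for three of the tasks
# above. Prefixes and label words follow the T5 paper's appendix; the
# function name and example values are hypothetical.
def to_text_to_text(task, example):
    """Return (input_text, target_text) for one training example."""
    if task == "cola":  # linguistic acceptability: binary label as a word
        target = "acceptable" if example["label"] == 1 else "unacceptable"
        return f"cola sentence: {example['sentence']}", target
    if task == "sst2":  # sentiment: binary label as a word
        target = "positive" if example["label"] == 1 else "negative"
        return f"sst2 sentence: {example['sentence']}", target
    if task == "stsb":  # regression: round the score to the nearest 0.2
        score = round(example["score"] * 5) / 5
        inp = (f"stsb sentence1: {example['sentence1']} "
               f"sentence2: {example['sentence2']}")
        return inp, f"{score:.1f}"
    raise ValueError(f"unknown task: {task}")

# An STS-B pair with gold similarity 3.77 becomes the string target "3.8".
print(to_text_to_text("stsb", {
    "sentence1": "A man is playing a guitar.",
    "sentence2": "A man plays the guitar.",
    "score": 3.77,
}))
```

Rounding STS-B scores to increments of 0.2 and emitting them as strings lets the same maximum-likelihood generation objective cover regression as well as classification.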
