XLNet: Generalized Autoregressive Pretraining for Language Understanding

NeurIPS 2019 · Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy...

Results from the Paper


TASK                         DATASET                      MODEL                 METRIC NAME          METRIC VALUE  GLOBAL RANK
Text Classification          AG News                      XLNet                 Error                4.45          #1
Text Classification          Amazon-2                     XLNet                 Error                2.11          #1
Text Classification          Amazon-5                     XLNet                 Error                31.67         #1
Document Ranking             ClueWeb09-B                  XLNet                 nDCG@20              31.10         #1
Document Ranking             ClueWeb09-B                  XLNet                 ERR@20               20.28         #1
Linguistic Acceptability     CoLA                         XLNet (single model)  Accuracy             69%           #3
Text Classification          DBpedia                      XLNet                 Error                0.6           #1
Text Classification          IMDb                         XLNet                 Accuracy             96.8          #1
Semantic Textual Similarity  MRPC                         XLNet (single model)  Accuracy             90.8%         #5
Natural Language Inference   MultiNLI                     XLNet (single model)  Matched              90.8          #4
Natural Language Inference   QNLI                         XLNet (single model)  Accuracy             94.9%         #6
Question Answering           Quora Question Pairs         XLNet (single model)  Accuracy             92.3%         #1
Reading Comprehension        RACE                         XLNet                 Accuracy             85.4          #5
Reading Comprehension        RACE                         XLNet                 Accuracy (High)      84.0          #5
Reading Comprehension        RACE                         XLNet                 Accuracy (Middle)    88.6          #4
Natural Language Inference   RTE                          XLNet (single model)  Accuracy             85.9%         #6
Question Answering           SQuAD1.1                     XLNet (single model)  EM                   89.898        #2
Question Answering           SQuAD1.1                     XLNet (single model)  F1                   95.080        #2
Question Answering           SQuAD1.1 dev                 XLNet (single model)  EM                   89.7          #3
Question Answering           SQuAD1.1 dev                 XLNet (single model)  F1                   95.1          #3
Question Answering           SQuAD2.0                     XLNet (single model)  EM                   87.926        #41
Question Answering           SQuAD2.0                     XLNet (single model)  F1                   90.689        #46
Question Answering           SQuAD2.0 dev                 XLNet (single model)  F1                   90.6          #1
Question Answering           SQuAD2.0 dev                 XLNet (single model)  EM                   87.9          #1
Sentiment Analysis           SST-2 Binary classification  XLNet (single model)  Accuracy             97            #3
Semantic Textual Similarity  STS Benchmark                XLNet (single model)  Pearson Correlation  0.925         #1
Text Classification          Yelp-2                       XLNet                 Accuracy             98.63%        #1
Text Classification          Yelp-5                       XLNet                 Accuracy             72.95%        #2

Methods used in the Paper