Supervised Learning of Universal Sentence Representations from Natural Language Inference Data

Many modern NLP systems rely on word embeddings, previously trained in an unsupervised manner on large corpora, as base features. Efforts to obtain embeddings for larger chunks of text, such as sentences, have, however, been less successful. Several attempts at learning unsupervised representations of sentences have not reached performance satisfactory enough to be widely adopted. In this paper, we show how universal sentence representations trained using the supervised data of the Stanford Natural Language Inference datasets can consistently outperform unsupervised methods like SkipThought vectors on a wide range of transfer tasks. Much like how computer vision uses ImageNet to obtain features that can then be transferred to other tasks, our work tends to indicate the suitability of natural language inference for transfer learning to other NLP tasks. Our encoder is publicly available.
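The results listed for this paper name the SNLI encoder as a "4096D BiLSTM with max-pooling". As a rough illustration of the pooling step only, the sketch below takes the element-wise maximum over a sequence of per-word hidden states to produce one fixed-size sentence vector. The function name `max_pool_over_time` and the toy hidden states are hypothetical; the BiLSTM that would produce the hidden states is not implemented here, and this is not the released encoder.

```python
def max_pool_over_time(hidden_states):
    """Element-wise max across time steps.

    hidden_states: list of equal-length vectors, one per word.
    Returns a single vector with the same dimensionality.
    """
    if not hidden_states:
        raise ValueError("need at least one time step")
    # zip(*...) groups the values of each dimension across all time steps.
    return [max(dim_values) for dim_values in zip(*hidden_states)]


# Hypothetical hidden states for a 3-word sentence, 4 dimensions each.
# A real encoder would use far more dimensions (4096 in the listed model).
states = [
    [0.1, -0.3, 0.5, 0.0],
    [0.4, 0.2, -0.1, 0.0],
    [-0.2, 0.9, 0.3, -0.5],
]

sentence_vector = max_pool_over_time(states)
print(sentence_vector)  # [0.4, 0.9, 0.5, 0.0]
```

Because the maximum is taken per dimension, the output size does not depend on sentence length, which is what lets sentences of different lengths share one representation space.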

EMNLP 2017
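| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Semantic Textual Similarity | MRPC | InferSent | Accuracy | 76.2% | #36 |
| Semantic Textual Similarity | MRPC | InferSent | F1 | 83.1% | #16 |
| Semantic Textual Similarity | SentEval | InferSent | MRPC | 76.2/83.1 | #1 |
| Semantic Textual Similarity | SentEval | InferSent | SICK-R | 0.884 | #2 |
| Semantic Textual Similarity | SentEval | InferSent | SICK-E | 86.3 | #2 |
| Semantic Textual Similarity | SentEval | InferSent | STS | 75.8/75.5 | #1 |
| Natural Language Inference | SNLI | 4096D BiLSTM with max-pooling | % Test Accuracy | 84.5 | #79 |
| Natural Language Inference | SNLI | 4096D BiLSTM with max-pooling | % Train Accuracy | 85.6 | #65 |
| Natural Language Inference | SNLI | 4096D BiLSTM with max-pooling | Parameters | 40m | #4 |
| Cross-Lingual Natural Language Inference | XNLI Zero-Shot English-to-French | X-BiLSTM | Accuracy | 67.7% | #2 |
| Cross-Lingual Natural Language Inference | XNLI Zero-Shot English-to-French | X-CBOW | Accuracy | 60.3% | #3 |
| Cross-Lingual Natural Language Inference | XNLI Zero-Shot English-to-German | X-CBOW | Accuracy | 61.0% | #4 |
| Cross-Lingual Natural Language Inference | XNLI Zero-Shot English-to-German | X-BiLSTM | Accuracy | 67.7% | #3 |
| Cross-Lingual Natural Language Inference | XNLI Zero-Shot English-to-Spanish | X-BiLSTM | Accuracy | 68.7% | #3 |
| Cross-Lingual Natural Language Inference | XNLI Zero-Shot English-to-Spanish | X-CBOW | Accuracy | 60.7% | #4 |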
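The MRPC entries above report two numbers for the same task: accuracy (76.2%) and F1 (83.1%), which the SentEval row writes as the pair 76.2/83.1. As a minimal sketch of how such a pair is computed for a binary paraphrase task, the snippet below derives both metrics from gold and predicted labels. The function name `accuracy_and_f1` and the toy labels are hypothetical and are not taken from MRPC or from the paper's evaluation code.

```python
def accuracy_and_f1(gold, pred, positive=1):
    """Return (accuracy, F1 of the positive class) for two label lists."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and the same length")

    pairs = list(zip(gold, pred))
    correct = sum(g == p for g, p in pairs)
    tp = sum(g == positive and p == positive for g, p in pairs)  # true positives
    fp = sum(g != positive and p == positive for g, p in pairs)  # false positives
    fn = sum(g == positive and p != positive for g, p in pairs)  # false negatives

    accuracy = correct / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy labels (hypothetical): 1 = paraphrase, 0 = not a paraphrase.
gold = [1, 1, 1, 0, 0, 1]
pred = [1, 1, 0, 0, 1, 1]

acc, f1 = accuracy_and_f1(gold, pred)
print(f"{acc:.3f}/{f1:.3f}")  # 0.667/0.750
```

F1 can exceed accuracy, as it does in the MRPC figures above, when the positive class is frequent and the model finds most of its members.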
