Learned in Translation: Contextualized Word Vectors

Computer vision has benefited from initializing multiple deep layers with weights pretrained on large supervised training sets like ImageNet. Natural language processing (NLP) typically sees initialization of only the lowest layer of deep models with pretrained word vectors. In this paper, we use a deep LSTM encoder from an attentional sequence-to-sequence model trained for machine translation (MT) to contextualize word vectors. We show that adding these context vectors (CoVe) improves performance over using only unsupervised word and character vectors on a wide variety of common NLP tasks: sentiment analysis (SST, IMDb), question classification (TREC), entailment (SNLI), and question answering (SQuAD). For fine-grained sentiment analysis and entailment, CoVe improves performance of our baseline models to the state of the art.

Published at NeurIPS 2017.
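
As a rough illustration of how CoVe is consumed downstream, the sketch below shows a two-layer bidirectional LSTM run over GloVe-style embeddings, with its outputs concatenated onto the original word vectors before being fed to a task-specific model. This mirrors the recipe described in the paper (300-d GloVe in, 600-d CoVe out), but it is only a minimal sketch: the class name `CoVeEncoder` is hypothetical, and the LSTM here is randomly initialized rather than loaded from the pretrained MT encoder the authors release.

```python
import torch
import torch.nn as nn


class CoVeEncoder(nn.Module):
    """Sketch of a CoVe-style encoder (hypothetical class, not the authors' release).

    A two-layer bidirectional LSTM applied on top of pretrained word embeddings.
    In the paper the LSTM weights come from the encoder of an attentional
    sequence-to-sequence MT model; here they are randomly initialized purely
    for illustration.
    """

    def __init__(self, vocab_size, embed_dim=300, hidden_dim=300):
        super().__init__()
        # Stand-in for fixed GloVe embeddings (the paper uses 300-d GloVe).
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Two-layer BiLSTM matching the MT-encoder architecture described in the paper.
        self.encoder = nn.LSTM(embed_dim, hidden_dim, num_layers=2,
                               bidirectional=True, batch_first=True)

    def forward(self, token_ids):
        glove = self.embed(token_ids)             # (batch, seq, 300)
        cove, _ = self.encoder(glove)             # (batch, seq, 600)
        # Downstream task models consume the concatenation [GloVe; CoVe].
        return torch.cat([glove, cove], dim=-1)   # (batch, seq, 900)


if __name__ == "__main__":
    enc = CoVeEncoder(vocab_size=10_000)
    tokens = torch.randint(0, 10_000, (2, 7))     # toy batch of 2 sentences, length 7
    features = enc(tokens)
    print(features.shape)                         # torch.Size([2, 7, 900])
```

In the full setup, character n-gram embeddings can be appended in the same way, which is the "+Char" variant reported in the results below.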

Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
| --- | --- | --- | --- | --- | --- |
| Sentiment Analysis | IMDb | BCN+Char+CoVe | Accuracy | 91.8 | #31 |
| Natural Language Inference | SNLI | Biattentive Classification Network + CoVe + Char | % Test Accuracy | 88.1 | #36 |
| Natural Language Inference | SNLI | Biattentive Classification Network + CoVe + Char | % Train Accuracy | 88.5 | #54 |
| Natural Language Inference | SNLI | Biattentive Classification Network + CoVe + Char | Parameters | 22M | #4 |
| Question Answering | SQuAD1.1 | DCN + Char + CoVe | EM | 71.3 | #152 |
| Question Answering | SQuAD1.1 | DCN + Char + CoVe | F1 | 79.9 | #158 |
| Question Answering | SQuAD1.1 dev | DCN + Char + CoVe | EM | 71.3 | #34 |
| Question Answering | SQuAD1.1 dev | DCN + Char + CoVe | F1 | 79.9 | #37 |
| Sentiment Analysis | SST-2 (binary classification) | BCN+Char+CoVe | Accuracy | 90.3 | #61 |
| Sentiment Analysis | SST-5 (fine-grained classification) | BCN+Char+CoVe | Accuracy | 53.7 | #9 |
| Text Classification | TREC-6 | CoVe | Error | 4.2 | #9 |

Methods