Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms

Many deep learning architectures have been proposed to model the compositionality in text sequences, requiring a substantial number of parameters and expensive computations. However, there has not been a rigorous evaluation regarding the added value of sophisticated compositional functions. In this paper, we conduct a point-by-point comparative study between Simple Word-Embedding-based Models (SWEMs), consisting of parameter-free pooling operations, and word-embedding-based RNN/CNN models. Surprisingly, SWEMs exhibit comparable or even superior performance in the majority of cases considered. Based upon this understanding, we propose two additional pooling strategies over learned word embeddings: (i) a max-pooling operation for improved interpretability; and (ii) a hierarchical pooling operation, which preserves spatial (n-gram) information within text sequences. We present experiments on 17 datasets encompassing three tasks: (i) (long) document classification; (ii) text sequence matching; and (iii) short text tasks, including classification and tagging. The source code and datasets can be obtained from https://github.com/dinghanshen/SWEM.
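As a concrete illustration of the parameter-free pooling the abstract describes, below is a minimal NumPy sketch of average pooling, max pooling, and their concatenation over a sequence's word embeddings. The function names and toy dimensions are illustrative only; for the authors' actual implementation, see the linked repository.

```python
import numpy as np

def swem_aver(emb):
    """Average pooling: mean of the word embeddings along the sequence axis."""
    return emb.mean(axis=0)

def swem_max(emb):
    """Max pooling: per-dimension maximum over all words in the sequence."""
    return emb.max(axis=0)

def swem_concat(emb):
    """Concatenation of the average- and max-pooled representations."""
    return np.concatenate([swem_aver(emb), swem_max(emb)])

# Toy example: a 5-token sequence with 4-dimensional embeddings.
emb = np.random.randn(5, 4)
print(swem_concat(emb).shape)  # -> (8,)
```

Because these operations introduce no parameters beyond the embeddings themselves, the resulting models are far cheaper to train than RNN/CNN encoders of comparable accuracy.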

Published at ACL 2018.
| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Text Classification | AG News | SWEM-concat | Error | 7.34 | #11 |
| Chunking | CoNLL 2000 | SWEM-CRF | F1 | 90.34 | #1 |
| Named Entity Recognition (NER) | CoNLL 2003 (English) | SWEM-CRF | F1 | 86.28 | #71 |
| Text Classification | DBpedia | SWEM-concat | Error | 1.43 | #19 |
| Sentiment Analysis | MR | SWEM-concat | Accuracy | 78.2 | #12 |
| Paraphrase Identification | MSRP | SWEM-concat | Accuracy | 71.5 | #3 |
| Paraphrase Identification | MSRP | SWEM-concat | F1 | 81.3 | #3 |
| Natural Language Inference | MultiNLI | SWEM-max | Matched accuracy | 68.2 | #51 |
| Natural Language Inference | MultiNLI | SWEM-max | Mismatched accuracy | 67.7 | #42 |
| Question Answering | Quora Question Pairs | SWEM-concat | Accuracy | 83.03 | #17 |
| Natural Language Inference | SNLI | SWEM-max | Test accuracy | 83.8 | #83 |
| Sentiment Analysis | SST-2 (binary) | SWEM-concat | Accuracy | 84.3 | #79 |
| Sentiment Analysis | SST-5 (fine-grained) | SWEM-concat | Accuracy | 46.1 | #23 |
| Subjectivity Analysis | SUBJ | SWEM-concat | Accuracy | 93.0 | #13 |
| Text Classification | TREC-6 | SWEM-aver | Error | 7.8 | #16 |
| Question Answering | WikiQA | SWEM-concat | MAP | 0.6788 | #19 |
| Question Answering | WikiQA | SWEM-concat | MRR | 0.6908 | #19 |
| Text Classification | Yahoo! Answers | SWEM-concat | Accuracy | 73.53 | #8 |
| Sentiment Analysis | Yelp (binary) | SWEM-hier | Error | 4.19 | #16 |
| Sentiment Analysis | Yelp (fine-grained) | SWEM-hier | Error | 36.21 | #15 |

Accuracy, error, and F1 values are percentages; MAP and MRR are on a 0-1 scale.

Methods


The SWEM variants evaluated above all combine word embeddings with parameter-free pooling:

- SWEM-aver: average pooling over the word embeddings of a sequence.
- SWEM-max: per-dimension max pooling, which also improves interpretability.
- SWEM-concat: concatenation of the average- and max-pooled representations.
- SWEM-hier: hierarchical pooling, i.e. local average pooling over n-gram windows followed by global max pooling, preserving spatial (n-gram) information.
- SWEM-CRF: SWEM representations combined with a CRF layer for sequence tagging.

A sketch of the hierarchical variant follows below.
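Here is a minimal NumPy sketch of the hierarchical pooling idea, assuming a `(seq_len, dim)` embedding matrix. The window size and the fallback behavior for short sequences are illustrative choices, not values taken from the paper.

```python
import numpy as np

def swem_hier(emb, window=3):
    """Hierarchical pooling: average within each local n-gram window,
    then max-pool across windows. `window` is an illustrative default,
    not the setting used in the paper."""
    seq_len, _ = emb.shape
    if seq_len < window:
        # Fallback for sequences shorter than the window (illustrative).
        return emb.mean(axis=0)
    local_avgs = np.stack([emb[i:i + window].mean(axis=0)
                           for i in range(seq_len - window + 1)])
    return local_avgs.max(axis=0)

# Toy example: a 7-token sequence with 4-dimensional embeddings.
emb = np.random.randn(7, 4)
print(swem_hier(emb).shape)  # -> (4,)
```

The local averaging step is what lets this variant retain word-order information that plain average or max pooling discards, which is consistent with SWEM-hier being the strongest variant on the Yelp sentiment tasks above.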