Multiway Attention Networks for Modeling Sentence Pairs
Modeling sentence pairs plays a vital role in judging the relationship between two sentences, as in paraphrase identification, natural language inference, and answer sentence selection. Previous work achieves very promising results using neural networks with attention mechanisms. In this paper, we propose multiway attention networks, which employ multiple attention functions to match sentence pairs under the matching-aggregation framework. Specifically, we design four attention functions to match words in corresponding sentences. We then aggregate the matching information from each function and combine the information from all functions to obtain the final representation. Experimental results demonstrate that the proposed multiway attention networks improve results on Quora Question Pairs, SNLI, MultiNLI, and the answer sentence selection task on the SQuAD dataset.
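The matching step described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not name the four attention functions, so the sketch assumes four common scoring forms (additive "concat", bilinear, element-wise product "dot", and difference "minus"). The names `multiway_match` and `init_params`, the random weights, and the hidden size are all hypothetical, and the aggregation and combination steps are not shown.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def init_params(d, h, seed=0):
    # Hypothetical random weights; a real model would learn these.
    rng = np.random.default_rng(seed)
    w = lambda *shape: rng.normal(scale=0.1, size=shape)
    return {
        "concat": (w(d, h), w(d, h), w(h)),
        "bilinear": w(d, d),
        "dot": (w(d, h), w(h)),
        "minus": (w(d, h), w(h)),
    }

def multiway_match(P, Q, params):
    """Match each word of P (m, d) against all words of Q (n, d).

    Returns a dict mapping each assumed attention function to an (m, d)
    array: for every word in P, an attention-weighted summary of Q.
    """
    Wc1, Wc2, vc = params["concat"]
    Wb = params["bilinear"]
    Wd, vd = params["dot"]
    Wm, vm = params["minus"]
    pair_prod = P[:, None, :] * Q[None, :, :]   # (m, n, d)
    pair_diff = P[:, None, :] - Q[None, :, :]   # (m, n, d)
    scores = {
        # Additive scoring over separately projected words.
        "concat": np.tanh((P @ Wc1)[:, None, :] + (Q @ Wc2)[None, :, :]) @ vc,
        # Bilinear form p^T W q.
        "bilinear": P @ Wb @ Q.T,
        # Scoring on the element-wise product of the two words.
        "dot": np.tanh(pair_prod @ Wd) @ vd,
        # Scoring on the element-wise difference of the two words.
        "minus": np.tanh(pair_diff @ Wm) @ vm,
    }
    # Normalize over the words of Q, then take the weighted sum of Q.
    return {name: softmax(S, axis=1) @ Q for name, S in scores.items()}
```

Each of the four outputs has one row per word of the first sentence, so a downstream aggregation layer could process them separately before they are combined into the final representation.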
Results from the Paper
Ranked #11 on Paraphrase Identification on Quora Question Pairs (Accuracy metric)
Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
---|---|---|---|---|---|
Paraphrase Identification | Quora Question Pairs | MwAN | Accuracy | 89.12 | # 11 |
Natural Language Inference | SNLI | 150D Multiway Attention Network Ensemble | % Test Accuracy | 89.4 | # 16 |
Natural Language Inference | SNLI | 150D Multiway Attention Network Ensemble | % Train Accuracy | 95.5 | # 9 |
Natural Language Inference | SNLI | 150D Multiway Attention Network Ensemble | Parameters | 58M | # 4 |
Natural Language Inference | SNLI | 150D Multiway Attention Network | % Test Accuracy | 88.3 | # 35 |
Natural Language Inference | SNLI | 150D Multiway Attention Network | % Train Accuracy | 94.5 | # 15 |
Natural Language Inference | SNLI | 150D Multiway Attention Network | Parameters | 14M | # 4 |