Quasi Bidirectional Encoder Representations from Transformers for Word Sense Disambiguation

RANLP 2019 · Michele Bevilacqua, Roberto Navigli ·

While contextualized embeddings have produced performance breakthroughs in many Natural Language Processing (NLP) tasks, Word Sense Disambiguation (WSD) has not benefited from them yet. In this paper, we introduce QBERT, a Transformer-based architecture for contextualized embeddings which makes use of a co-attentive layer to produce more deeply bidirectional representations, better-fitting for the WSD task. As a result, we are able to train a WSD system that beats the state of the art on the concatenation of all evaluation datasets by over 3 points, also outperforming a comparable model using ELMo.

PDF Abstract

Code

Add Remove Mark official

No code implementations yet. Submit your code now

Tasks

Add Remove

Word Sense Disambiguation

Datasets

Add Datasets introduced or used in this paper

Results from the Paper

Add Remove

Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods

Add Remove

BiLSTM • ELMo • LSTM • Sigmoid Activation • Softmax • Tanh Activation

Edit Social Preview

Quasi Bidirectional Encoder Representations from Transformers for Word Sense Disambiguation

Code Edit Add Remove Mark official

Tasks Edit Add Remove

Datasets Edit

Results from the Paper Edit Add Remove

Methods Edit Add Remove

Code

Add Remove Mark official

Tasks

Add Remove

Datasets

Results from the Paper

Add Remove

Methods

Add Remove