SenseBERT: Driving Some Sense into BERT

The ability to learn from large unlabeled corpora has allowed neural language models to advance the frontier in natural language understanding. However, existing self-supervision techniques operate at the word form level, which serves as a surrogate for the underlying semantic content. This paper proposes a method to employ weak supervision directly at the word sense level. Our model, named SenseBERT, is pre-trained to predict not only the masked words but also their WordNet supersenses. Accordingly, we attain a lexical-semantic level language model, without the use of human annotation. SenseBERT achieves significantly improved lexical understanding, as we demonstrate by experimenting on SemEval Word Sense Disambiguation, and by attaining a state-of-the-art result on the Word in Context (WiC) task.
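
Since the supersense supervision comes from WordNet rather than human annotators, a weak target for a masked word can be derived automatically from the set of supersenses the word can take. Below is a minimal sketch of that idea (not the authors' released code), assuming PyTorch and NLTK's WordNet interface; the names supersense_soft_label, sensebert_style_loss, word_logits, and sense_logits are illustrative stand-ins for the outputs of a word-prediction head and a supersense-prediction head.

```python
# Sketch of weak supersense labeling plus a combined pre-training loss.
# Requires: pip install nltk torch, and nltk.download('wordnet').
import torch
import torch.nn.functional as F
from nltk.corpus import wordnet as wn

# Fixed inventory of WordNet supersenses (lexicographer file names),
# e.g. 'noun.animal', 'verb.motion'; WordNet defines 45 of them.
SUPERSENSES = sorted({s.lexname() for s in wn.all_synsets()})
SENSE2ID = {name: i for i, name in enumerate(SUPERSENSES)}

def supersense_soft_label(word: str) -> torch.Tensor:
    """Uniform soft label over all supersenses the surface form can take."""
    senses = {s.lexname() for s in wn.synsets(word)}
    label = torch.zeros(len(SUPERSENSES))
    if not senses:                      # out-of-WordNet word: no supersense signal
        return label
    for name in senses:
        label[SENSE2ID[name]] = 1.0 / len(senses)
    return label

def sensebert_style_loss(word_logits, sense_logits, word_target, word_string):
    """Masked-word cross-entropy plus a soft cross-entropy over supersenses."""
    lm_loss = F.cross_entropy(word_logits, word_target)
    soft = supersense_soft_label(word_string)
    sense_loss = -(soft * F.log_softmax(sense_logits, dim=-1)).sum()
    return lm_loss + sense_loss

# Example: a masked token whose surface form is "bass" (fish vs. instrument).
vocab_size = 30522
word_logits = torch.randn(1, vocab_size)        # from the word-prediction head
sense_logits = torch.randn(len(SUPERSENSES))    # from the supersense head
word_target = torch.tensor([1234])              # gold token id (illustrative)
print(sensebert_style_loss(word_logits, sense_logits, word_target, "bass"))
```

Because the exact sense of the masked word is unknown, the supersense target is a soft label spread over all of the word's possible supersenses; this is the sense in which the supervision is weak.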

ACL 2020
Task | Dataset | Model | Metric | Value | Global Rank
Natural Language Inference | QNLI | SenseBERT-base 110M | Accuracy | 90.6% | #32
Natural Language Inference | RTE | SenseBERT-base 110M | Accuracy | 67.5% | #62
Word Sense Disambiguation | Words in Context | SenseBERT-large 340M | Accuracy | 72.1% | #11
Word Sense Disambiguation | Words in Context | SenseBERT-base 110M | Accuracy | 70.3% | #12

Methods


No methods listed for this paper.