Lexical Semantic Recognition

In lexical semantics, full-sentence segmentation and segment labeling of various phenomena are generally treated separately, despite their interdependence. We hypothesize that a unified lexical semantic recognition task is an effective way to encapsulate previously disparate styles of annotation, including multiword expression identification / classification and supersense tagging. Using the STREUSLE corpus, we train a neural CRF sequence tagger and evaluate its performance along various axes of annotation. As the label set generalizes that of previous tasks (PARSEME, DiMSUM), we additionally evaluate how well the model generalizes to those test sets, finding that it approaches or surpasses existing models despite training only on STREUSLE. Our work also establishes baseline models and evaluation metrics for integrated and accurate modeling of lexical semantics, facilitating future work in this area.
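
The abstract describes the model only as a neural CRF sequence tagger trained on STREUSLE. As a rough sketch of that general architecture (not the authors' released implementation), the snippet below wires a pretrained BERT encoder into a linear-chain CRF over a flat lexical-semantic tag set; the encoder name, the NUM_TAGS placeholder, and the use of the pytorch-crf package are assumptions made here for illustration.

```python
import torch.nn as nn
from torchcrf import CRF                      # pip install pytorch-crf (assumed dependency)
from transformers import AutoModel, AutoTokenizer

NUM_TAGS = 500  # placeholder: size of the full lexical-semantic tag inventory

class NeuralCRFTagger(nn.Module):
    """Contextual encoder -> per-token emission scores -> linear-chain CRF."""
    def __init__(self, encoder_name="bert-base-uncased", num_tags=NUM_TAGS):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        self.emit = nn.Linear(self.encoder.config.hidden_size, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, tags=None):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        emissions = self.emit(hidden)          # (batch, seq_len, num_tags)
        mask = attention_mask.bool()
        if tags is not None:                   # training: negative log-likelihood
            return -self.crf(emissions, tags, mask=mask, reduction="mean")
        return self.crf.decode(emissions, mask=mask)  # inference: Viterbi tag sequences

# Toy usage: tag one sentence (predictions here are per wordpiece; a real system
# would pool wordpieces back to words before assigning STREUSLE-style tags).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = NeuralCRFTagger()
batch = tokenizer(["I picked up the tab for dinner"], return_tensors="pt")
print(model(batch["input_ids"], batch["attention_mask"]))
```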

PDF · Abstract · ACL (MWE) 2021

Datasets

STREUSLE · PARSEME · DiMSUM

Task: Natural Language Understanding · Dataset: STREUSLE (global leaderboard rank in parentheses)

| Model                   | Role F1 (Preps) | Function F1 (Preps) | Full F1 (Preps) | Tags (Full) Acc |
|-------------------------|-----------------|---------------------|-----------------|-----------------|
| BERT (pred POS/lemmas)  | 72.4 (#1)       | 82.8 (#1)           | 71.6 (#2)       | 82.5 (#1)       |
| GloVe (none)            | –               | –                   | 58.1 (#8)       | 77.5 (#5)       |
| GloVe (pred POS/lemmas) | –               | –                   | 58.0 (#9)       | 77.1 (#6)       |
| GloVe (gold POS/lemmas) | –               | –                   | 61.0 (#5)       | 79.3 (#4)       |
| BERT (none)             | 71.9 (#3)       | 81.0 (#3)           | 70.9 (#4)       | 82.0 (#2)       |
| BERT (gold POS/lemmas)  | 72.4 (#1)       | 81.7 (#2)           | 71.4 (#3)       | 81.0 (#3)       |
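
For readers unfamiliar with the metric names above: "Tags (Full) Acc" is a per-token accuracy over complete lexical-semantic tags, while the preposition scores are precision/recall/F1 over labeled units. The sketch below illustrates how such scores can be computed in general; it is not the official STREUSLE evaluation script, and the tag strings and (span, label) unit representation are hypothetical.

```python
def full_tag_accuracy(gold_tags, pred_tags):
    """Share of tokens whose complete tag (MWE status + supersense) matches gold."""
    assert len(gold_tags) == len(pred_tags)
    return sum(g == p for g, p in zip(gold_tags, pred_tags)) / len(gold_tags)

def labeled_unit_f1(gold_units, pred_units):
    """Precision/recall/F1 over sets of (token_span, label) units,
    e.g. preposition targets labeled with a supersense role or function."""
    gold, pred = set(gold_units), set(pred_units)
    tp = len(gold & pred)
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Toy example with hypothetical tag strings and units.
gold = ["O", "B-V-v.cognition", "O", "O-P-p.Topic"]
pred = ["O", "B-V-v.cognition", "O", "O-P-p.Theme"]
print(full_tag_accuracy(gold, pred))                                   # 0.75
print(labeled_unit_f1({((3, 4), "p.Topic")}, {((3, 4), "p.Theme")}))   # (0.0, 0.0, 0.0)
```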

Methods

BERT · GloVe · CRF