BERT for Coreference Resolution: Baselines and Analysis

We apply BERT to coreference resolution, achieving strong improvements on the OntoNotes (+3.9 F1) and GAP (+11.5 F1) benchmarks. A qualitative analysis of model predictions indicates that, compared to ELMo and BERT-base, BERT-large is particularly better at distinguishing between related but distinct entities (e.g., President and CEO). However, there is still room for improvement in modeling document-level context, conversations, and mention paraphrasing. Our code and models are publicly available.

IJCNLP 2019

Results from the Paper


Ranked #10 on Coreference Resolution on CoNLL 2012 (using extra training data)

Task                    Dataset     Model                    Metric Name  Metric Value  Global Rank  Uses Extra Training Data
Coreference Resolution  CoNLL 2012  c2f-coref + BERT-large   Avg F1       76.9          #10          Yes
Coreference Resolution  OntoNotes   BERT-large               F1           76.9          #14
Coreference Resolution  OntoNotes   BERT-base                F1           73.9          #16
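The "Avg F1" reported for CoNLL 2012 is conventionally the unweighted mean of the MUC, B-cubed, and CEAF F1 scores from the official scorer. A minimal sketch of that averaging, using hypothetical per-metric scores (only the 76.9 average appears in the table above):

```python
def conll_avg_f1(muc_f1: float, b3_f1: float, ceafe_f1: float) -> float:
    """CoNLL-2012 'Avg F1': unweighted mean of MUC, B-cubed, and CEAF-e F1."""
    return (muc_f1 + b3_f1 + ceafe_f1) / 3.0

# Hypothetical per-metric scores chosen for illustration; they are not
# the paper's actual MUC / B-cubed / CEAF-e numbers.
avg = conll_avg_f1(80.4, 73.9, 76.4)
print(round(avg, 1))  # 76.9
```

Individual metrics trade off differently (MUC is link-based, B-cubed mention-based, CEAF entity-based), which is why the benchmark averages all three rather than reporting any single one.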

Methods