Coreference Resolution in Full Text Articles with BERT and Syntax-based Mention Filtering

WS 2019 · Hai-Long Trieu, Anh-Khoa Duong Nguyen, Nhung Nguyen, Makoto Miwa, Hiroya Takamura, Sophia Ananiadou

This paper describes our system developed for the coreference resolution task of the CRAFT Shared Tasks 2019. The CRAFT corpus is more challenging than other existing corpora because it contains full text articles. We employ an existing span-based state-of-the-art neural coreference resolution system as a baseline. We enhance the system with two techniques to capture long-distance coreferent pairs. First, we filter noisy mentions based on parse trees while increasing the number of antecedent candidates. Second, instead of relying on LSTMs, we integrate the highly expressive language model, BERT, into our model. Experimental results show that our proposed systems significantly outperform the baseline. The best performing system obtained F-scores of 44%, 48%, 39%, 49%, 40%, and 57% on the test set with the B3, BLANC, CEAFE, CEAFM, LEA, and MUC metrics, respectively. Additionally, the proposed model is able to detect coreferent pairs over long distances, even when the mentions are more than 200 sentences apart.
