BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision

28 Jun 2020 · Chen Liang, Yue Yu, Haoming Jiang, Siawpeng Er, Ruijia Wang, Tuo Zhao, Chao Zhang

We study the open-domain named entity recognition (NER) problem under distant supervision. Although distant supervision does not require large amounts of manual annotation, it yields highly incomplete and noisy labels derived from external knowledge bases. To address this challenge, we propose a new computational framework, BOND, which leverages pre-trained language models (e.g., BERT and RoBERTa) to improve the prediction performance of NER models. Specifically, we propose a two-stage training algorithm. In the first stage, we adapt the pre-trained language model to the NER task using the distant labels, which can significantly improve both recall and precision. In the second stage, we drop the distant labels and propose a self-training approach to further improve model performance. Thorough experiments on 5 benchmark datasets demonstrate the superiority of BOND over existing distantly supervised NER methods. The code and distantly labeled data have been released at https://github.com/cliang1453/BOND.
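
The two-stage procedure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: a nearest-centroid classifier over made-up one-dimensional token features stands in for a fine-tuned BERT model, and the names `fit`, `predict`, and `two_stage_train`, the feature values, and the labels are all hypothetical. It only shows the control flow: fit on distant labels first, then discard them and iterate on the model's own pseudo-labels.

```python
from statistics import mean


def fit(features, labels):
    """Toy 'model': one centroid per label (a stand-in for fine-tuning BERT)."""
    classes = sorted(set(labels))
    return {c: mean(x for x, y in zip(features, labels) if y == c) for c in classes}


def predict(model, features):
    """Assign each token the label whose centroid is nearest."""
    return [min(model, key=lambda c: abs(model[c] - x)) for x in features]


def two_stage_train(features, distant_labels, rounds=3):
    # Stage 1: adapt the model to the task using the noisy distant labels.
    teacher = fit(features, distant_labels)
    # Stage 2: drop the distant labels; the teacher's predictions become
    # pseudo-labels for a student, which then replaces the teacher.
    for _ in range(rounds):
        pseudo_labels = predict(teacher, features)
        student = fit(features, pseudo_labels)
        teacher = student
    return teacher


# Hypothetical features: values near 1.0 are entity tokens, near 0.0 are not.
features = [0.0, 0.1, 0.2, 0.1, 0.9, 1.0, 0.8, 1.1]
gold = ["O", "O", "O", "O", "ENT", "ENT", "ENT", "ENT"]
# Distant labels are incomplete: the knowledge base missed the last two entities.
distant = ["O", "O", "O", "O", "ENT", "ENT", "O", "O"]

model = two_stage_train(features, distant)
predictions = predict(model, features)
print(predictions)
```

In this toy setting the model recovers the two entity tokens that the distant labels missed, which mirrors the motivation for training on model predictions rather than on the incomplete distant labels. A real system would differ in many details that the abstract does not specify.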

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Weakly-Supervised Named Entity Recognition | CoNLL03 | BOND | F1 | 81.48 | #1 |
| Weakly-Supervised Named Entity Recognition | Ontonotes v5 (English) | BOND | F1 | 68.35 | #1 |
| Weakly-Supervised Named Entity Recognition | Tweet | BOND | F1 | 48.01 | #1 |
| Weakly-Supervised Named Entity Recognition | Webpage | BOND | F1 | 65.74 | #1 |
| Weakly-Supervised Named Entity Recognition | Wikigold | BOND | F1 | 60.07 | #1 |

Methods