Simplify the Usage of Lexicon in Chinese NER

ACL 2020  ·  Ruotian Ma, Minlong Peng, Qi Zhang, Xuanjing Huang

Recently, many works have tried to improve the performance of Chinese named entity recognition (NER) using word lexicons. As a representative, Lattice-LSTM (Zhang and Yang, 2018) has achieved new benchmark results on several public Chinese NER datasets. However, Lattice-LSTM has a complex model architecture. This limits its application in many industrial areas where real-time NER responses are needed. In this work, we propose a simple but effective method for incorporating the word lexicon into the character representations. This method avoids designing a complicated sequence modeling architecture, and for any neural NER model, it requires only a subtle adjustment of the character representation layer to introduce the lexicon information. Experimental studies on four benchmark Chinese NER datasets show that our method achieves an inference speed up to 6.15 times faster than that of state-of-the-art methods, along with better performance. The experimental results also show that the proposed method can be easily combined with pre-trained models like BERT.
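The abstract does not spell out how lexicon information reaches the character representation layer, so the following is only a minimal sketch of one plausible reading: for each character, collect the lexicon words that cover it and group them by where the character falls in the word (begin, middle, end, single). The function name `match_lexicon`, the `max_word_len` parameter, and the toy lexicon are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Set


def match_lexicon(
    sentence: str, lexicon: Set[str], max_word_len: int = 4
) -> List[Dict[str, Set[str]]]:
    """Group the lexicon words covering each character by the character's
    position inside the word: B(egin), M(iddle), E(nd), S(ingle).

    This is an illustrative assumption, not the authors' exact procedure.
    """
    features = [{"B": set(), "M": set(), "E": set(), "S": set()} for _ in sentence]
    n = len(sentence)
    for start in range(n):
        # Enumerate every substring beginning at `start`, up to max_word_len.
        for end in range(start + 1, min(n, start + max_word_len) + 1):
            word = sentence[start:end]
            if word not in lexicon:
                continue
            if len(word) == 1:
                features[start]["S"].add(word)
                continue
            features[start]["B"].add(word)
            features[end - 1]["E"].add(word)
            for mid in range(start + 1, end - 1):
                features[mid]["M"].add(word)
    return features


# Toy lexicon and sentence, for illustration only.
lexicon = {"中国", "中国人", "国人", "人民", "人"}
sentence = "中国人民"
features = match_lexicon(sentence, lexicon)

for char, groups in zip(sentence, features):
    print(char, {tag: sorted(words) for tag, words in groups.items()})
```

In a full model, each group would presumably be pooled into a fixed-size vector (for example, an average of word embeddings) and concatenated with the character embedding, which leaves the sequence modeling layer unchanged. That pooling step is likewise an assumption here.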

Task | Dataset | Model | Metric Name | Metric Value | Global Rank
Chinese Named Entity Recognition | MSRA | LSTM + Lexicon augment | F1 | 93.5 | # 16
Chinese Named Entity Recognition | OntoNotes 4 | LSTM + Lexicon augment | F1 | 75.54 | # 12
Chinese Named Entity Recognition | Resume NER | LSTM + Lexicon augment | F1 | 95.59 | # 8
Chinese Named Entity Recognition | Weibo NER | LSTM + Lexicon augment | F1 | 61.24 | # 13

Methods