no code implementations • 6 May 2023 • Beiduo Chen, Shaohan Huang, Zihan Zhang, Wu Guo, ZhenHua Ling, Haizhen Huang, Furu Wei, Weiwei Deng, Qi Zhang
In addition, two self-correction courses are proposed to bridge the gap between the two encoders by creating a "correction notebook" for secondary supervision.
1 code implementation • 7 Dec 2022 • Jun-Yu Ma, Beiduo Chen, Jia-Chen Gu, Zhen-Hua Ling, Wu Guo, Quan Liu, Zhigang Chen, Cong Liu
In this study, a mixture of short-channel distillers (MSD) method is proposed to enable full interaction among the rich hierarchical information in the teacher model and to transfer knowledge to the student model sufficiently and efficiently.
no code implementations • 17 May 2022 • Beiduo Chen, Wu Guo, Quan Liu, Kun Tao
Multilingual BERT (mBERT), a language model pre-trained on large multilingual corpora, has impressive zero-shot cross-lingual transfer capabilities and performs surprisingly well on zero-shot POS tagging and Named Entity Recognition (NER), as well as on cross-lingual model transfer.
1 code implementation • SemEval (NAACL) 2022 • Beiduo Chen, Jun-Yu Ma, Jiajun Qi, Wu Guo, Zhen-Hua Ling, Quan Liu
The proposed method is applied to several state-of-the-art Transformer-based NER models with a gazetteer built from Wikidata, and it generalizes well across them.
no code implementations • 26 Feb 2022 • Beiduo Chen, Wu Guo, Bin Gu, Quan Liu, Yongchao Wang
Cross-lingual pre-trained models such as multilingual BERT (mBERT) have achieved strong performance on various cross-lingual downstream NLP tasks.