no code implementations • 25 Dec 2022 • Zhengxin Yang, Wanling Gao, Chunjie Luo, Lei Wang, Fei Tang, Xu Wen, Jianfeng Zhan
The study reveals a counterintuitive finding: deep learning inference quality fluctuates with inference time.
no code implementations • ACL 2021 • Yang Feng, Shuhao Gu, Dengji Guo, Zhengxin Yang, Chenze Shao
Meanwhile, we force the conventional decoder to simulate the behaviors of the seer decoder via knowledge distillation.
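The paper's exact distillation objective is not reproduced here; as a minimal, illustrative sketch (the function names, temperature value, and use of a KL divergence between softened distributions are assumptions, not the authors' stated formulation), step-wise knowledge distillation from a teacher decoder to a student decoder can look like this:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Numerically stable softmax over the vocabulary axis.
    z = np.exp((logits - logits.max(axis=-1, keepdims=True)) / temperature)
    return z / z.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student), averaged over decoding steps.

    Hypothetical sketch: the conventional decoder (student) is pushed to
    match the seer decoder's (teacher's) softened output distribution at
    each step, rather than only the hard ground-truth tokens.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = (p_teacher * (np.log(p_teacher) - np.log(p_student))).sum(axis=-1)
    return kl.mean()
```

In this kind of setup the loss is zero when the two decoders produce identical distributions and grows as they diverge, so minimizing it transfers the teacher's soft predictions to the student.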
no code implementations • 5 May 2021 • Zhengxin Yang
Moreover, by leveraging the full-sentence model's ability to encode the whole sentence, our decoding strategy enhances the information maintained in the decoded states in real time.
1 code implementation • 30 Nov 2019 • Yang Feng, Wanying Xie, Shuhao Gu, Chenze Shao, Wen Zhang, Zhengxin Yang, Dong Yu
Neural machine translation models usually adopt the teacher forcing strategy for training, which requires that the predicted sequence match the ground truth word by word and forces the probability of each prediction to approach a 0-1 distribution.
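The "0-1 distribution" in standard teacher forcing corresponds to a one-hot target at every decoding step: the training loss is the cross-entropy between the model's predicted distribution and the single ground-truth token. A minimal sketch (the function name, shapes, and demo values are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def teacher_forcing_loss(logits, target_ids):
    """Per-step cross-entropy against one-hot (0-1) targets.

    logits: (seq_len, vocab_size) unnormalized scores, one row per step.
    target_ids: (seq_len,) ground-truth token ids fed back as inputs
    under teacher forcing.
    """
    shifted = logits - logits.max(axis=-1, keepdims=True)  # stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Pick the log-probability assigned to each ground-truth token.
    return -log_probs[np.arange(len(target_ids)), target_ids].mean()

# Demo: 4 decoding steps over a vocabulary of 6 tokens.
rng = np.random.default_rng(0)
loss = teacher_forcing_loss(rng.normal(size=(4, 6)), np.array([0, 1, 2, 3]))
```

Because each target places all probability mass on one token, the loss is minimized only when the model's distribution collapses toward that 0-1 shape, which is the behavior the excerpt describes.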
no code implementations • IJCNLP 2019 • Zhengxin Yang, Jinchao Zhang, Fandong Meng, Shuhao Gu, Yang Feng, Jie Zhou
Context modeling is essential for generating coherent and consistent translations in document-level neural machine translation.