Modeling Homophone Noise for Robust Neural Machine Translation

15 Dec 2020 · Wenjie Qin, Xiang Li, Yuhui Sun, Deyi Xiong, Jianwei Cui, Bin Wang ·

In this paper, we propose a robust neural machine translation (NMT) framework. The framework consists of a homophone noise detector and a syllable-aware NMT model to homophone errors. The detector identifies potential homophone errors in a textual sentence and converts them into syllables to form a mixed sequence that is then fed into the syllable-aware NMT. Extensive experiments on Chinese->English translation demonstrate that our proposed method not only significantly outperforms baselines on noisy test sets with homophone noise, but also achieves a substantial improvement on clean text.

PDF Abstract