TMU NMT System with Automatic Post-Editing by Multi-Source Levenshtein Transformer for the Restricted Translation Task of WAT 2022

WAT 2022 · Seiichiro Kondo, Mamoru Komachi ·

In this paper, we describe our TMU English–Japanese systems submitted to the restricted translation task at WAT 2022 (Nakazawa et al., 2022). In this task, we translate an input sentence with the constraint that certain words or phrases (called restricted target vocabularies (RTVs)) should be contained in the output sentence. To satisfy this constraint, we address this task using a combination of two techniques. One is lexical-constraint-aware neural machine translation (LeCA) (Chen et al., 2020), which is a method of adding RTVs at the end of input sentences. The other is multi-source Levenshtein transformer (MSLevT) (Wan et al., 2020), which is a non-autoregressive method for automatic post-editing. Our system generates translations in two steps. First, we generate the translation using LeCA. Subsequently, we filter the sentences that do not satisfy the constraints and post-edit them with MSLevT. Our experimental results reveal that 100% of the RTVs can be included in the generated sentences while maintaining the translation quality of the LeCA model on both English to Japanese (En→Ja) and Japanese to English (Ja→En) tasks. Furthermore, the method used in previous studies requires an increase in the beam size to satisfy the constraints, which is computationally expensive. In contrast, the proposed method does not require a similar increase and can generate translations faster.

PDF Abstract