Long Warm-up and Self-Training: Training Strategies of NICT-2 NMT System at WAT-2019
This paper describes the NICT-2 neural machine translation system at the 6th Workshop on Asian Translation. The system employs the standard Transformer model but has two distinguishing characteristics. The first is a long warm-up strategy, which warms up the learning rate over a longer period at the start of training than conventional approaches do. The second is a set of self-training approaches based on multiple back-translations generated by sampling. We participated in three tasks (ASPEC.en-ja, ASPEC.ja-en, and TDDC.ja-en) using this system.
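To make the effect of a longer warm-up concrete, the following is a minimal sketch rather than the system's actual configuration. The abstract does not state which schedule or step counts were used, so the sketch assumes the inverse-square-root schedule of the original Transformer; the function name `noam_lr` and the values `d_model=512`, `4000`, and `16000` are illustrative assumptions.

```python
def noam_lr(step, d_model=512, warmup_steps=4000):
    """Learning rate of the original Transformer schedule (assumed here).

    The rate grows linearly for `warmup_steps` updates, then decays
    proportionally to the inverse square root of the step number.
    """
    step = max(step, 1)  # avoid 0 ** -0.5 at the very first update
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


# Hypothetical warm-up lengths; the real values are not given in the abstract.
for step in (1000, 4000, 16000):
    conventional = noam_lr(step, warmup_steps=4000)
    long_warmup = noam_lr(step, warmup_steps=16000)
    print(f"step {step:>6}: conventional={conventional:.6f}  long={long_warmup:.6f}")
```

Under this assumed schedule, a longer warm-up lowers the peak learning rate and reaches it later, while both settings follow the same decay curve once the longer warm-up has finished.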