BPE-Dropout: Simple and Effective Subword Regularization

ACL 2020 · Ivan Provilkov, Dmitrii Emelianenko, Elena Voita

Subword segmentation is widely used to address the open vocabulary problem in machine translation. The dominant approach to subword segmentation is Byte Pair Encoding (BPE), which keeps the most frequent words intact while splitting the rare ones into multiple tokens...
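To make the segmentation behaviour concrete, here is a minimal Python sketch of applying a ranked BPE merge table to a single word, with an optional `dropout` probability that skips each candidate merge at random. The function name `bpe_segment`, the toy merge table, and the omission of end-of-word markers are all assumptions made for illustration. The dropout step follows the general idea suggested by the paper's title, not code taken from the authors. With `dropout=0.0` the sketch reduces to ordinary deterministic BPE.

```python
import random


def bpe_segment(word, merges, dropout=0.0, rng=None):
    """Segment `word` using a ranked list of BPE merges.

    `merges` is a list of (left, right) pairs, earliest = highest priority.
    Each candidate merge is skipped with probability `dropout` at every step
    (hypothetical helper for illustration; no end-of-word marker is used).
    """
    rng = rng or random.Random()
    ranks = {pair: i for i, pair in enumerate(merges)}
    tokens = list(word)  # start from single characters

    while len(tokens) > 1:
        # Adjacent pairs that are in the merge table and survive dropout.
        candidates = [
            (ranks[(a, b)], i)
            for i, (a, b) in enumerate(zip(tokens, tokens[1:]))
            if (a, b) in ranks and rng.random() >= dropout
        ]
        if not candidates:
            break
        # Apply the highest-priority surviving merge (leftmost on ties).
        _, i = min(candidates)
        tokens[i:i + 2] = [tokens[i] + tokens[i + 1]]

    return tokens


# Toy merge table, invented for this example.
toy_merges = [("l", "o"), ("lo", "w"), ("e", "r")]

deterministic = bpe_segment("lower", toy_merges)             # plain BPE
fully_dropped = bpe_segment("lower", toy_merges, dropout=1.0)  # no merges applied
sampled = bpe_segment("lower", toy_merges, dropout=0.3, rng=random.Random(0))
```

Running the segmenter repeatedly with `0 < dropout < 1` yields different segmentations of the same word, while any sampled segmentation still concatenates back to the original string.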

Task                 Dataset                       Model                    Metric  Value  Global rank
Machine Translation  IWSLT2015 English-Vietnamese  Transformer+BPE-dropout  BLEU    33.27  #1

Methods used in the Paper