Sequence Generation with Mixed Representations

ICML 2020 Lijun Wu Shufang Xie Yingce Xia Fan Yang Tao Qin Jianhuang Lai Tie-Yan Liu

Tokenization is the first step of many natural language processing (NLP) tasks and plays an important role for neural NLP models. Tokenizaton method such as byte-pair encoding (BPE), which can greatly reduce the large vocabulary and deal with out-of-vocabulary words, has shown to be effective and is widely adopted for sequence generation tasks... (read more)

PDF Abstract ICML 2020 PDF
TASK DATASET MODEL METRIC NAME METRIC VALUE GLOBAL RANK BENCHMARK
Machine Translation IWSLT2014 English-German MixedRepresentations BLEU score 29.93 # 1
Machine Translation IWSLT2014 German-English MixedRepresentations BLEU score 36.41 # 2

Methods used in the Paper