Sequence Generation with Mixed Representations

Tokenization is the first step of many natural language processing (NLP) tasks and plays an important role in neural NLP models. Tokenization methods such as byte-pair encoding (BPE), which greatly reduce vocabulary size and handle out-of-vocabulary words, have proven effective and are widely adopted for sequence generation tasks. While various tokenization methods exist, there is no consensus on which one is best. In this work, we propose to leverage mixed representations from different tokenization methods for sequence generation tasks, in order to boost model performance by exploiting the unique characteristics and advantages of individual tokenization methods. Specifically, we introduce a new model architecture to incorporate mixed representations and a co-teaching algorithm to better utilize the diversity of different tokenization methods. Our approach achieves significant improvements on neural machine translation (NMT) tasks across six language pairs (e.g., English↔German, English↔Romanian), as well as on an abstractive summarization task.
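To make the idea concrete, the sketch below shows one simple way to combine representations obtained from two different tokenizations of the same sentence. It is a minimal illustrative PyTorch example under assumed vocabulary sizes and a naive concatenate-and-project fusion, not the paper's actual architecture or co-teaching algorithm; the class name, dimensions, and toy token IDs are placeholders.

```python
# Minimal illustrative sketch (not the paper's architecture): fuse sentence
# representations obtained from two different tokenizations of the same input.
import torch
import torch.nn as nn

class MixedRepEncoder(nn.Module):
    def __init__(self, bpe_vocab_size, word_vocab_size, dim=256):
        super().__init__()
        # One embedding table per tokenization scheme (hypothetical sizes).
        self.bpe_emb = nn.Embedding(bpe_vocab_size, dim)
        self.word_emb = nn.Embedding(word_vocab_size, dim)
        self.fuse = nn.Linear(2 * dim, dim)  # simple fusion of the two views

    def forward(self, bpe_ids, word_ids):
        # Mean-pool each tokenized view into a single sentence vector.
        bpe_repr = self.bpe_emb(bpe_ids).mean(dim=1)
        word_repr = self.word_emb(word_ids).mean(dim=1)
        # Concatenate and project: a "mixed representation" of the sentence.
        return self.fuse(torch.cat([bpe_repr, word_repr], dim=-1))

# Toy usage: the same sentence tokenized two ways (IDs are made up).
encoder = MixedRepEncoder(bpe_vocab_size=10000, word_vocab_size=30000)
bpe_ids = torch.tensor([[17, 842, 6, 93]])   # e.g., BPE subword IDs
word_ids = torch.tensor([[412, 9, 1301]])    # e.g., word-level IDs
mixed = encoder(bpe_ids, word_ids)           # shape: (1, 256)
```

In the paper, mixed representations are incorporated inside a sequence-to-sequence model and trained with a co-teaching algorithm across tokenization variants; the sketch only illustrates the core "mixed representation" idea at the sentence level.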

ICML 2020

Task                  Dataset                    Model                  Metric       Value   Global Rank
Machine Translation   IWSLT2014 English-German   MixedRepresentations   BLEU score   29.93   #5
Machine Translation   IWSLT2014 German-English   MixedRepresentations   BLEU score   36.41   #13
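For reference, BLEU scores like those above are conventionally computed with a standard toolkit such as sacreBLEU; the snippet below shows a typical evaluation call with placeholder hypothesis and reference sentences (it does not reproduce the IWSLT2014 numbers).

```python
# Illustrative BLEU evaluation with sacreBLEU; sentences are placeholders.
import sacrebleu

hypotheses = ["the cat sat on the mat"]           # system outputs, one string per segment
references = [["the cat is sitting on the mat"]]  # one list per reference set

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")
```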
