MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning

In sequence-to-sequence learning, the self-attention mechanism has proved highly effective, achieving significant improvements across many tasks. However, the self-attention mechanism is not without its own flaws...
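For readers unfamiliar with the mechanism the abstract refers to, the following is a minimal scaled dot-product self-attention sketch in numpy. It is illustrative only: it is not the MUSE architecture, whose parallel multi-scale details are described in the full paper, and the function and weight names are hypothetical.

```python
# Illustrative scaled dot-product self-attention (not the MUSE model).
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model); w_q/w_k/w_v: (d_model, d_k) projections."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])           # (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ v                                # (seq_len, d_k)

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 8))                       # 5 tokens, d_model=8
w_q, w_k, w_v = (rng.standard_normal((8, 4)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (5, 4)
```

Each output position is a weighted average of the value vectors for all positions, with weights given by a softmax over query-key similarities.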

Task                 Dataset                    Model                                  Metric      Value  Global Rank
Machine Translation  IWSLT2014 German-English   MUSE (Parallel Multi-Scale Attention)  BLEU score  36.3   #5
Machine Translation  WMT2014 English-French     MUSE (Parallel Multi-Scale Attention)  BLEU score  43.5   #5
Machine Translation  WMT2014 English-German     MUSE (Parallel Multi-Scale Attention)  BLEU score  29.9   #7

Methods used in the Paper