SRPOL's System for the IWSLT 2020 End-to-End Speech Translation Task
We took part in the offline End-to-End English-to-German TED lectures translation task. We based our solution on last year's submission and used a slightly altered Transformer architecture with a ResNet-like convolutional layer that prepares the audio input for the Transformer encoder. To improve the model's translation quality, we introduced two regularization techniques and trained on a machine-translated LibriSpeech corpus in addition to the IWSLT corpus and the TED-LIUM 2 and MuST-C corpora. Our best model scored almost 3 BLEU higher than last year's model. To segment the 2020 test set we used exactly the same procedure as last year.
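A convolutional frontend of this kind typically downsamples the audio frame sequence before it reaches the Transformer encoder. As a minimal sketch of that effect (the kernel sizes, strides, and padding below are illustrative assumptions, not the paper's actual configuration):

```python
def conv_out_len(n, kernel, stride, padding):
    # Standard 1-D convolution output-length formula.
    return (n + 2 * padding - kernel) // stride + 1

def frontend_out_len(n_frames, layers):
    # Apply each (kernel, stride, padding) conv layer in turn.
    for k, s, p in layers:
        n_frames = conv_out_len(n_frames, k, s, p)
    return n_frames

# Hypothetical frontend: two stride-2 convs (kernel 3, padding 1),
# shortening the sequence the Transformer encoder must attend over.
layers = [(3, 2, 1), (3, 2, 1)]
print(frontend_out_len(1000, layers))  # 1000 frames -> 250 positions
```

Such downsampling reduces the quadratic self-attention cost on long audio inputs, which is one common motivation for placing convolutions before the encoder.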