Search Results for author: Dominik Macháček

Found 14 papers, 3 papers with code

ELITR: European Live Translator

no code implementations • EAMT 2020 • Ondřej Bojar, Dominik Macháček, Sangeet Sagar, Otakar Smrž, Jonáš Kratochvíl, Ebrahim Ansari, Dario Franceschini, Chiara Canton, Ivan Simonini, Thai-Son Nguyen, Felix Schneider, Sebastian Stücker, Alex Waibel, Barry Haddow, Rico Sennrich, Philip Williams

The ELITR (European Live Translator) project aims to create a speech translation system for simultaneous subtitling of conferences and online meetings, targeting up to 43 languages.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Turning Whisper into Real-Time Transcription System

1 code implementation • 27 Jul 2023 • Dominik Macháček, Raj Dabre, Ondřej Bojar

Whisper is one of the recent state-of-the-art multilingual speech recognition and translation models; however, it is not designed for real-time transcription.

speech-recognition Speech Recognition +1

Robustness of Multi-Source MT to Transcription Errors

no code implementations • 26 May 2023 • Dominik Macháček, Peter Polák, Ondřej Bojar, Raj Dabre

Automatic speech translation is sensitive to speech recognition errors, but in a multilingual scenario, the same content may be available in various languages via simultaneous interpreting, dubbing or subtitling.

Machine Translation speech-recognition +2

MT Metrics Correlate with Human Ratings of Simultaneous Speech Translation

1 code implementation • 16 Nov 2022 • Dominik Macháček, Ondřej Bojar, Raj Dabre

There have been several meta-evaluation studies on the correlation between human ratings and offline machine translation (MT) evaluation metrics such as BLEU, chrF2, BertScore and COMET.

Machine Translation Translation

Comprehension of Subtitles from Re-Translating Simultaneous Speech Translation

no code implementations • 4 Mar 2022 • Dávid Javorský, Dominik Macháček, Ondřej Bojar

Our results show that subtitling layout and flicker have little effect on comprehension, in contrast to machine translation itself and individual competence.

Machine Translation Translation

The Reality of Multi-Lingual Machine Translation

no code implementations • 25 Feb 2022 • Tom Kocmi, Dominik Macháček, Ondřej Bojar

Machine translation is for us a prime example of deep learning applications where human skills and learning capabilities are taken as a benchmark that many try to match and surpass.

Cross-Lingual Transfer Machine Translation +2

Lost in Interpreting: Speech Translation from Source or Interpreter?

no code implementations • 17 Jun 2021 • Dominik Macháček, Matúš Žilinec, Ondřej Bojar

Interpreters facilitate multi-lingual meetings, but the affordable set of languages is often smaller than what is needed.

Machine Translation Translation

Presenting Simultaneous Translation in Limited Space

no code implementations • 18 Sep 2020 • Dominik Macháček, Ondřej Bojar

Furthermore, we propose a way to estimate the overall usability of the combination of automatic translation and subtitling by measuring quality, latency, and stability on a test set, and we propose an improved measure of translation latency.

Translation

ELITR Non-Native Speech Translation at IWSLT 2020

no code implementations • WS 2020 • Dominik Macháček, Jonáš Kratochvíl, Sangeet Sagar, Matúš Žilinec, Ondřej Bojar, Thai-Son Nguyen, Felix Schneider, Philip Williams, Yuekun Yao

This paper is an ELITR system submission for the non-native speech translation task at IWSLT 2020.

Translation

Promoting the Knowledge of Source Syntax in Transformer NMT Is Not Needed

no code implementations • 24 Oct 2019 • Thuong-Hai Pham, Dominik Macháček, Ondřej Bojar

The data manipulation techniques, recommended in previous works, prove ineffective in large data settings.

Machine Translation Multi-Task Learning +3

A Speech Test Set of Practice Business Presentations with Additional Relevant Texts

no code implementations • 2 Aug 2019 • Dominik Macháček, Jonáš Kratochvíl, Tereza Vojtěchová, Ondřej Bojar

We present a test corpus of audio recordings and transcriptions of presentations of students' enterprises together with their slides and web-pages.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

English-Czech Systems in WMT19: Document-Level Transformer

no code implementations • WS 2019 • Martin Popel, Dominik Macháček, Michal Auersperger, Ondřej Bojar, Pavel Pecina

We describe our NMT systems submitted to the WMT19 shared task in English-Czech news translation.

NMT Sentence +1

CUNI Systems for the Unsupervised News Translation Task in WMT 2019

no code implementations • 29 Jul 2019 • Ivana Kvapilíková, Dominik Macháček, Ondřej Bojar

In this paper, we describe the CUNI translation system used for the unsupervised news shared task of the ACL 2019 Fourth Conference on Machine Translation (WMT19).

Machine Translation Translation

Morphological and Language-Agnostic Word Segmentation for NMT

1 code implementation • 14 Jun 2018 • Dominik Macháček, Jonáš Vidra, Ondřej Bojar

The state of the art of handling rich morphology in neural machine translation (NMT) is to break word forms into subword units, so that the overall vocabulary size of these units fits the practical limits given by the NMT model and GPU memory capacity.

Machine Translation NMT +1
