no code implementations • EMNLP (insights) 2021 • Jan Rosendahl, Christian Herold, Frithjof Petrick, Hermann Ney
In this work, we conduct a comprehensive investigation of one of the centerpieces of modern machine translation systems: the encoder-decoder attention mechanism.
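The mechanism under investigation, in its standard scaled dot-product form, can be sketched as follows (a generic illustration with toy list-based vectors, not the paper's specific variant or analysis):

```python
import math

def cross_attention(queries, keys, values):
    """Scaled dot-product encoder-decoder attention: each decoder
    query attends over all encoder keys, and the resulting softmax
    weights mix the encoder values."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # similarity of this query to every encoder position
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        # numerically stable softmax over the scores
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        # weighted sum of encoder values
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs
```

With a single encoder position the attention weight is 1 and the output is simply that position's value vector.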
no code implementations • Findings (ACL) 2022 • Christian Herold, Jan Rosendahl, Joris Vanvinckenroye, Hermann Ney
The filtering and/or selection of training data is one of the core aspects to be considered when building a strong machine translation system. In their influential work, Khayrallah and Koehn (2018) investigated the impact of different types of noise on the performance of machine translation systems. In the same year, the WMT introduced a shared task on parallel corpus filtering, which went on to be repeated in the following years and resulted in many different filtering approaches being proposed. In this work we aim to combine the recent achievements in data filtering with the original analysis of Khayrallah and Koehn (2018) and investigate whether state-of-the-art filtering systems are capable of removing all the suggested noise types. We observe that most of these types of noise can be detected with an accuracy of over 90% by modern filtering systems when operating in a well-studied high-resource setting. However, we also find that when confronted with more refined noise categories or when working with a less common language pair, the performance of the filtering systems is far from optimal, showing that there is still room for improvement in this area of research.
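One classic heuristic from this line of work is a length-ratio filter, which catches misaligned or truncated segments; a minimal sketch (this is a generic baseline heuristic, not one of the state-of-the-art systems evaluated in the paper):

```python
def length_ratio_filter(pairs, max_ratio=2.0):
    """Keep sentence pairs whose source/target token-length ratio
    stays within max_ratio in either direction.  Pairs with an
    empty side are always dropped."""
    kept = []
    for src, tgt in pairs:
        ls, lt = len(src.split()), len(tgt.split())
        if ls and lt and max(ls, lt) / min(ls, lt) <= max_ratio:
            kept.append((src, tgt))
    return kept

pairs = [("a b c", "x y"),          # ratio 1.5 -> kept
         ("a", "x y z w q r")]      # ratio 6.0 -> dropped
print(length_ratio_filter(pairs))
```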
no code implementations • IWSLT (ACL) 2022 • Frithjof Petrick, Jan Rosendahl, Christian Herold, Hermann Ney
After its introduction, the Transformer architecture quickly became the gold standard for the task of neural machine translation.
no code implementations • 18 Oct 2023 • Frithjof Petrick, Christian Herold, Pavel Petrushkov, Shahram Khadivi, Hermann Ney
Finally, we explore language model fusion in the light of recent advancements in large language models.
no code implementations • 8 Jun 2023 • Christian Herold, Hermann Ney
Document-level context for neural machine translation (NMT) is crucial for improving translation consistency and cohesion, the translation of ambiguous inputs, and several other linguistic phenomena.
no code implementations • 8 Jun 2023 • Christian Herold, Yingbo Gao, Mohammad Zeineldeen, Hermann Ney
The integration of language models for neural machine translation has been extensively studied in the past.
no code implementations • 8 Jun 2023 • Christian Herold, Hermann Ney
On the other hand, in most works, the question of how to perform search with the trained model is scarcely discussed, and sometimes not mentioned at all.
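The search step in question is typically beam search over model scores; a minimal sketch of its mechanics (the scoring function, steps, and vocabulary here are toy stand-ins; in NMT, `score_fn` would be the trained translation model):

```python
def beam_search(score_fn, steps, vocab_size, beam_size=2):
    """Minimal beam search: score_fn(prefix) returns one
    log-probability per vocabulary token for the next position.
    At every step, all one-token extensions of the surviving
    hypotheses are scored and only the top beam_size are kept."""
    beams = [([], 0.0)]  # (token sequence, cumulative log-prob)
    for _ in range(steps):
        candidates = []
        for seq, logp in beams:
            scores = score_fn(seq)
            for tok in range(vocab_size):
                candidates.append((seq + [tok], logp + scores[tok]))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams
```

With `beam_size=1` this degenerates to greedy search, which is the usual baseline comparison.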
1 code implementation • 24 Oct 2022 • Viet Anh Khoa Tran, David Thulke, Yingbo Gao, Christian Herold, Hermann Ney
Currently, in speech translation, the straightforward approach of cascading a recognition system with a translation system delivers state-of-the-art results.
Automatic Speech Recognition (ASR) +2
no code implementations • 21 Oct 2022 • Yingbo Gao, Christian Herold, Zijian Yang, Hermann Ney
Checkpoint averaging is a simple and effective method to boost the performance of converged neural machine translation models.
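The basic operation behind checkpoint averaging is a uniform element-wise mean over the parameters of the last few saved checkpoints; a minimal sketch, with checkpoints represented as dicts of plain float lists standing in for weight tensors:

```python
def average_checkpoints(checkpoints):
    """Uniformly average parameter values across checkpoints.

    Each checkpoint maps parameter names to lists of floats;
    all checkpoints must share the same names and shapes."""
    n = len(checkpoints)
    averaged = {}
    for name in checkpoints[0]:
        values = [ckpt[name] for ckpt in checkpoints]
        # element-wise mean over the n checkpoints
        averaged[name] = [sum(vals) / n for vals in zip(*values)]
    return averaged

ckpts = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [2.0]},
]
print(average_checkpoints(ckpts))  # {'w': [2.0, 3.0], 'b': [1.0]}
```

In practice the same loop runs over real weight tensors from the last k training checkpoints, producing a single averaged model for decoding.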
no code implementations • 21 Oct 2022 • Yingbo Gao, Christian Herold, Zijian Yang, Hermann Ney
Encoder-decoder architecture is widely adopted for sequence-to-sequence modeling tasks.
no code implementations • 25 Nov 2021 • Matthias Perkonigg, Johannes Hofmanninger, Christian Herold, Helmut Prosch, Georg Langs
Here, we propose a method for continual active learning operating on a stream of medical images in a multi-scanner setting.
no code implementations • NAACL 2021 • Christian Herold, Jan Rosendahl, Joris Vanvinckenroye, Hermann Ney
While we find that our approaches come out on top on all three tasks, different variants perform best on different tasks.
no code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Yingbo Gao, Weiyue Wang, Christian Herold, Zijian Yang, Hermann Ney
In order to combat overfitting and in pursuit of better generalization, label smoothing is widely applied in modern neural machine translation systems.
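Label smoothing replaces the one-hot training target with a softened distribution: the true class keeps mass 1 - epsilon and the remainder is spread over the other classes. A minimal sketch of the smoothed target construction:

```python
def smooth_labels(target_index, vocab_size, epsilon=0.1):
    """Return a label-smoothed target distribution: the true
    class gets probability 1 - epsilon, and the remaining
    epsilon is spread uniformly over the other classes."""
    uniform = epsilon / (vocab_size - 1)
    dist = [uniform] * vocab_size
    dist[target_index] = 1.0 - epsilon
    return dist
```

Training then minimizes cross-entropy against this softened distribution instead of the one-hot target, which discourages over-confident output distributions.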
no code implementations • WMT (EMNLP) 2020 • Jingjing Huo, Christian Herold, Yingbo Gao, Leonard Dahlmann, Shahram Khadivi, Hermann Ney
Context-aware neural machine translation (NMT) is a promising direction for improving translation quality by making use of additional context, e.g., document-level translation or the use of meta-information.
no code implementations • 6 Jul 2020 • Johannes Hofmanninger, Matthias Perkonigg, James A. Brink, Oleg Pianykh, Christian Herold, Georg Langs
In medical imaging, technical progress or changes in diagnostic procedures lead to a continuous change in image appearance.
no code implementations • WS 2020 • Parnia Bahar, Patrick Wilken, Tamer Alkhouli, Andreas Guta, Pavel Golik, Evgeny Matusov, Christian Herold
AppTek and RWTH Aachen University team together to participate in the offline and simultaneous speech translation tracks of IWSLT 2020.
Automatic Speech Recognition (ASR) +4
no code implementations • EMNLP (IWSLT) 2019 • Yingbo Gao, Christian Herold, Weiyue Wang, Hermann Ney
Prominently used in support vector machines and logistic regressions, kernel functions (kernels) can implicitly map data points into high dimensional spaces and make it easier to learn complex decision boundaries.
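A standard example of such a kernel is the Gaussian (RBF) kernel, which corresponds to an inner product in an infinite-dimensional feature space without ever computing the mapping explicitly; a minimal sketch (a textbook kernel shown for illustration, not the specific kernels studied in the paper):

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel: k(x, y) = exp(-gamma * ||x - y||^2).
    Values lie in (0, 1], with k(x, x) = 1."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)
```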
no code implementations • WS 2019 • Jan Rosendahl, Christian Herold, Yunsu Kim, Miguel Graça, Weiyue Wang, Parnia Bahar, Yingbo Gao, Hermann Ney
For the De-En task, none of the tested methods gave a significant improvement over last year's winning system, and we end up with the same performance of 39.6% BLEU on newstest2019.
no code implementations • WS 2018 • Christian Herold, Yingbo Gao, Hermann Ney
Embedding and projection matrices are commonly used in neural language models (NLM) as well as in other sequence processing networks that operate on large vocabularies.
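The reason these matrices matter for large vocabularies is their size: the input embedding is a V x d matrix and the output projection a d x V matrix, so together they scale linearly with the vocabulary. A minimal parameter-count sketch (a generic illustration, including the common weight-tying option of sharing one matrix for both roles, not necessarily the paper's proposal):

```python
def nlm_matrix_params(vocab_size, embed_dim, tied=False):
    """Parameter count of the input embedding (V x d) and output
    projection (d x V) matrices of a neural language model.
    With weight tying, a single matrix serves both roles."""
    embedding = vocab_size * embed_dim
    projection = 0 if tied else embed_dim * vocab_size
    return embedding + projection
```

For a 10k-word vocabulary and 512-dimensional embeddings, the untied pair already accounts for over ten million parameters, which is why these matrices dominate the model size at large vocabularies.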