no code implementations • WMT (EMNLP) 2021 • Wen Lai, Jindřich Libovický, Alexander Fraser
This paper describes the submission of LMU Munich to the WMT 2021 multilingual machine translation task for small track #1, which studies translation between 6 languages (Croatian, Hungarian, Estonian, Serbian, Macedonian, English) in 30 directions.
no code implementations • Findings (ACL) 2022 • Jindřich Libovický, Helmut Schmid, Alexander Fraser
We present a literature and empirical survey that critically assesses the state of the art in character-level modeling for machine translation (MT).
no code implementations • WMT (EMNLP) 2021 • Jindřich Libovický, Alexander Fraser
We present our submissions to the WMT21 shared task in Unsupervised and Very Low Resource machine translation between German and Upper Sorbian, German and Lower Sorbian, and Russian and Chuvash.
no code implementations • WMT (EMNLP) 2021 • Jindřich Libovický, Alexander Fraser
We present the findings of the WMT2021 Shared Tasks in Unsupervised MT and Very Low Resource Supervised MT.
no code implementations • WMT (EMNLP) 2020 • Jindřich Libovický, Viktor Hangya, Helmut Schmid, Alexander Fraser
We present our systems for the WMT20 Very Low Resource MT Task for translation between German and Upper Sorbian.
no code implementations • 10 Apr 2024 • Martin Popel, Lucie Poláková, Michal Novák, Jindřich Helcl, Jindřich Libovický, Pavel Straňák, Tomáš Krabač, Jaroslava Hlaváčová, Mariia Anisimova, Tereza Chlaňová
We present Charles Translator, a machine translation system between Ukrainian and Czech, developed as part of a society-wide effort to mitigate the impact of the Russian-Ukrainian war on individuals and society.
no code implementations • 9 Apr 2024 • Katharina Hämmerl, Jindřich Libovický, Alexander Fraser
Cross-lingual alignment, the meaningful similarity of representations across languages in multilingual language models, has been an active field of research in recent years.
no code implementations • 20 Mar 2024 • Adnan Al Ali, Jindřich Libovický
This case study focuses on the political biases of pre-trained encoders in Czech and compares them with a representative value survey.
no code implementations • 5 Mar 2024 • Philipp J. Rösch, Norbert Oswald, Michaela Geierhos, Jindřich Libovický
Current multimodal models leveraging contrastive learning often face limitations in developing fine-grained conceptual understanding.
no code implementations • 25 Oct 2023 • Jindřich Helcl, Jindřich Libovický
The goal of the shared task was to develop systems for named entity recognition and question answering in several under-represented languages.
1 code implementation • 20 Jul 2023 • Hynek Kydlíček, Jindřich Libovický
Pre-trained models for Czech Natural Language Processing are often evaluated on purely linguistic tasks (POS tagging, parsing, NER) and relatively simple classification tasks such as sentiment classification or article classification from a single news source.
1 code implementation • 1 Jun 2023 • Katharina Hämmerl, Alina Fastowski, Jindřich Libovický, Alexander Fraser
We investigate outlier dimensions and their relationship to anisotropy in multiple pre-trained multilingual language models.
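A common proxy for anisotropy in the literature is the average cosine similarity between random pairs of embeddings, and outlier dimensions are often flagged by their unusually large average magnitude. The sketch below illustrates both notions on raw numpy arrays; it is a minimal illustration, not the measurement procedure used in the paper, and the 3-sigma threshold is an assumption chosen for the example.

```python
import numpy as np

def anisotropy(emb):
    """Average pairwise cosine similarity of a set of embeddings.
    Near 0 for an isotropic space, near 1 for a degenerate one."""
    x = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = x @ x.T
    n = len(emb)
    return sims[~np.eye(n, dtype=bool)].mean()  # off-diagonal entries only

def outlier_dimensions(emb, k=3.0):
    """Dimensions whose mean absolute activation exceeds the mean of the
    per-dimension means by k standard deviations (a simple criterion)."""
    means = np.abs(emb).mean(axis=0)
    return np.where(means > means.mean() + k * means.std())[0]

rng = np.random.default_rng(0)
isotropic = rng.normal(size=(50, 32))
shifted = isotropic + 8.0  # a large shared offset makes all vectors similar
```

On the toy data, `anisotropy(shifted)` is far higher than `anisotropy(isotropic)`, mirroring the degenerate geometry reported for contextual embedding spaces.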
no code implementations • 23 May 2023 • Jindřich Libovický
We study how multilingual sentence representations capture European countries and occupations and how this differs across European languages.
no code implementations • Findings (NAACL) 2022 • Philipp J. Rösch, Jindřich Libovický
Our results thus highlight an important issue of multimodal modeling: the mere presence of information detectable by a probing classifier is not a guarantee that the information is available in a cross-modal setup.
no code implementations • 1 Dec 2022 • Martin Popel, Jindřich Libovický, Jindřich Helcl
We present Charles University submissions to the WMT22 General Translation Shared Task on Czech-Ukrainian and Ukrainian-Czech machine translation.
1 code implementation • 14 Nov 2022 • Katharina Hämmerl, Björn Deiseroth, Patrick Schramowski, Jindřich Libovický, Constantin A. Rothkopf, Alexander Fraser, Kristian Kersting
Do the models capture moral norms from English and impose them on other languages?
no code implementations • 18 Mar 2022 • Katharina Hämmerl, Björn Deiseroth, Patrick Schramowski, Jindřich Libovický, Alexander Fraser, Kristian Kersting
Massively multilingual sentence representations are trained on large corpora of uncurated data, with a very imbalanced proportion of languages included in the training.
1 code implementation • Findings (ACL) 2022 • Katharina Hämmerl, Jindřich Libovický, Alexander Fraser
We combine the strengths of static and contextual models to improve multilingual representations.
1 code implementation • COLING 2022 • Wen Lai, Jindřich Libovický, Alexander Fraser
First, we aim at domain robustness, i.e., high translation quality both on domains seen in the training data and on unseen domains.
1 code implementation • spnlp (ACL) 2022 • Jindřich Libovický, Alexander Fraser
We propose the neural string edit distance model for string-pair matching and string transduction based on learnable string edit distance.
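The core idea can be illustrated with the classic edit-distance dynamic program, where the fixed 0/1 operation costs are replaced by functions that could be learned parameters. This is only a sketch of the underlying recursion, not the paper's model, which further replaces the hard minimum with differentiable operations so the costs can be trained end-to-end.

```python
import numpy as np

def soft_edit_distance(s, t, sub_cost, ins_cost, del_cost):
    """Edit-distance DP with pluggable per-symbol operation costs."""
    m, n = len(s), len(t)
    d = np.zeros((m + 1, n + 1))
    for i in range(1, m + 1):
        d[i, 0] = d[i - 1, 0] + del_cost(s[i - 1])
    for j in range(1, n + 1):
        d[0, j] = d[0, j - 1] + ins_cost(t[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i, j] = min(
                d[i - 1, j] + del_cost(s[i - 1]),                 # deletion
                d[i, j - 1] + ins_cost(t[j - 1]),                 # insertion
                d[i - 1, j - 1] + sub_cost(s[i - 1], t[j - 1]),   # substitution/match
            )
    return d[m, n]

# With unit costs this reduces to ordinary Levenshtein distance.
dist = soft_edit_distance(
    "kitten", "sitting",
    sub_cost=lambda a, b: 0.0 if a == b else 1.0,
    ins_cost=lambda c: 1.0,
    del_cost=lambda c: 1.0,
)
print(dist)  # 3.0
```

Replacing the lambdas with trainable cost tables turns the same recursion into a scoring model for string-pair matching.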
2 code implementations • EMNLP 2020 • Jindřich Libovický, Alexander Fraser
Applying the Transformer architecture at the character level usually requires very deep models that are difficult and slow to train.
1 code implementation • Findings (EMNLP) 2020 • Jindřich Libovický, Rudolf Rosa, Alexander Fraser
Multilingual contextual embeddings, such as multilingual BERT and XLM-RoBERTa, have proved useful for many multilingual tasks.
no code implementations • 7 Apr 2020 • Zdeněk Kasner, Jindřich Libovický, Jindřich Helcl
Non-autoregressive (NAR) models for machine translation (MT) decode considerably faster than autoregressive (AR) models, at the expense of impaired output fluency.
1 code implementation • 8 Nov 2019 • Jindřich Libovický, Rudolf Rosa, Alexander Fraser
Multilingual BERT (mBERT) provides sentence representations for 104 languages, which are useful for many multilingual tasks.
no code implementations • 29 Aug 2019 • Jindřich Libovický, Pranava Madhyastha
In this paper, we present a meta-study assessing the representational quality of models whose training signal comes from different modalities, in particular language modeling, image feature prediction, and both textual and multimodal machine translation.
no code implementations • 10 Jul 2019 • Jindřich Libovický
Filters of convolutional networks used in computer vision are often visualized as image patches that maximize the response of the filter.
no code implementations • WS 2019 • Jindřich Helcl, Jindřich Libovický, Martin Popel
We present our submission to the WMT19 Robustness Task.
1 code implementation • 12 Nov 2018 • Jindřich Libovický, Jindřich Helcl
Autoregressive decoding is the only part of sequence-to-sequence models that prevents them from massive parallelization at inference time.
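The contrast can be sketched in a few lines: an autoregressive decoder must loop because each step consumes the previous output, while a non-autoregressive decoder predicts every position in one batched call. The model below is a toy stand-in (a single random projection), not an actual translation decoder; all names and the state update are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, HID, LEN = 16, 8, 5
W = rng.normal(size=(HID, VOCAB))  # toy stand-in for a decoder's output layer

# Autoregressive: each step feeds the previous token back -> sequential loop.
def decode_ar(state):
    out = []
    for _ in range(LEN):
        tok = int((state @ W).argmax())
        out.append(tok)
        state = np.tanh(state + W[:, tok])  # toy state update from last token
    return out

# Non-autoregressive: all positions predicted independently -> one batched call.
def decode_nar(states):  # states: (LEN, HID), one hidden state per position
    return list((states @ W).argmax(axis=1))
```

The sequential dependency in `decode_ar` is exactly what prevents parallelizing over output positions; `decode_nar` removes it at the cost of conditioning each token on nothing but its position.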
no code implementations • 12 Nov 2018 • Jindřich Helcl, Jindřich Libovický, Dušan Variš
For our submission, we acquired both textual and multimodal additional data.
no code implementations • 12 Nov 2018 • Jindřich Libovický, Jindřich Helcl, David Mareček
In multi-source sequence-to-sequence tasks, the attention mechanism can be modeled in several ways.
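Two of the standard combination strategies for multi-source attention can be contrasted in a few lines: a flat scheme attends over the concatenation of all source states, while a hierarchical scheme attends within each source and then attends over the per-source contexts. This is a minimal single-head numpy sketch under assumed shapes, not the paper's trained models.

```python
import numpy as np

def attend(query, keys):
    """Single-head dot-product attention: returns a context vector over keys."""
    scores = keys @ query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ keys

def flat_combination(query, sources):
    """Flat: concatenate all source states and attend over them jointly."""
    return attend(query, np.concatenate(sources, axis=0))

def hierarchical_combination(query, sources):
    """Hierarchical: attend within each source first, then run a second
    attention over the resulting per-source context vectors."""
    contexts = np.stack([attend(query, src) for src in sources])
    return attend(query, contexts)
```

Both return a single context vector of the model dimension; they differ in whether the competition for attention mass happens across all states at once or in two stages.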
no code implementations • 14 Jul 2017 • Jindřich Helcl, Jindřich Libovický
For Task 1 (multimodal translation), our best scoring system is a purely textual neural translation of the source image caption to the target language.
1 code implementation • 21 Apr 2017 • Jindřich Libovický, Jindřich Helcl
Modeling attention in neural multi-source sequence-to-sequence learning remains a relatively unexplored area, despite its usefulness in tasks that incorporate multiple source languages or modalities.
no code implementations • WS 2016 • Jindřich Libovický, Jindřich Helcl, Marek Tlustý, Pavel Pecina, Ondřej Bojar
Neural sequence-to-sequence learning has recently become a very promising paradigm in machine translation, achieving competitive results with statistical phrase-based systems.