Low-Resource Neural Machine Translation
23 papers with code • 1 benchmark • 4 datasets
Low-resource machine translation is the task of translating to or from a low-resource language, i.e. one for which little parallel training data is available.
Latest papers with no code
Extremely low-resource machine translation for closely related languages
An effective method to improve extremely low-resource neural machine translation is multilingual training, which can be further strengthened by leveraging monolingual data to create synthetic bilingual corpora via back-translation.
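Back-translation can be sketched in a few lines. In the sketch below, `reverse_translate` is a hypothetical stand-in for a trained target-to-source model; a real system would run a decoder here.

```python
def reverse_translate(target_sentence):
    # Hypothetical placeholder for a trained target->source NMT model;
    # here we just reverse the words so the sketch is runnable.
    return " ".join(reversed(target_sentence.split()))

def back_translate(monolingual_target):
    """Pair each monolingual target sentence with a synthetic source,
    yielding (synthetic_source, real_target) training pairs."""
    return [(reverse_translate(t), t) for t in monolingual_target]

# Synthetic bilingual corpus built from target-side monolingual text.
corpus = back_translate(["ndinotenda kwazvo", "mhoro shamwari"])
```

The synthetic pairs are then mixed with the genuine parallel data when training the forward (source-to-target) model.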
Data Augmentation for Sign Language Gloss Translation
Sign language translation (SLT) is often decomposed into video-to-gloss recognition and gloss-to-text translation, where a gloss is a sequence of transcribed spoken-language words in the order in which they are signed.
Continual Mixed-Language Pre-Training for Extremely Low-Resource Neural Machine Translation
The data scarcity in low-resource languages has become a bottleneck to building robust neural machine translation systems.
Data Augmentation by Concatenation for Low-Resource Translation: A Mystery and a Solution
In this paper, we investigate the driving factors behind concatenation, a simple but effective data augmentation method for low-resource neural machine translation.
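The concatenation idea can be sketched directly: new training examples are formed by joining two existing sentence pairs, source sides and target sides in the same order. The function name and seed below are illustrative, not from the paper.

```python
import random

def concat_augment(pairs, n_new, seed=0):
    """Augment a parallel corpus by concatenating randomly chosen
    sentence pairs: sources are joined in the same order as targets."""
    rng = random.Random(seed)
    augmented = []
    for _ in range(n_new):
        (s1, t1), (s2, t2) = rng.choice(pairs), rng.choice(pairs)
        augmented.append((s1 + " " + s2, t1 + " " + t2))
    return pairs + augmented

data = [("a b", "x y"), ("c", "z")]
bigger = concat_augment(data, 3)  # original pairs plus 3 concatenations
```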
Low-Resource Neural Machine Translation for Southern African Languages
Motivated by this challenge, we compare zero-shot learning, transfer learning, and multilingual learning on three Bantu languages (Shona, isiXhosa, and isiZulu) and English.
Low-Resource Machine Translation Training Curriculum Fit for Low-Resource Languages
We conduct an empirical study of neural machine translation (NMT) for truly low-resource languages and propose a training curriculum suited to settings where both parallel data and compute resources are scarce, reflecting the reality of most of the world's languages and of the researchers working on them.
The Effectiveness of Morphology-aware Segmentation in Low-Resource Neural Machine Translation
This paper evaluates the performance of several modern subword segmentation methods in a low-resource neural machine translation setting.
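The most common subword segmentation baseline in such evaluations is byte-pair encoding (BPE), which repeatedly merges the most frequent adjacent symbol pair. A minimal sketch (toy corpus and merge count are illustrative):

```python
from collections import Counter

def merge_pair(symbols, pair):
    """Merge every adjacent occurrence of `pair` in a symbol tuple."""
    out, i = [], 0
    while i < len(symbols):
        if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)

def learn_bpe(word_freqs, num_merges):
    """Learn BPE merges from a word-frequency dict; each word is split
    into characters plus an end-of-word marker '</w>'."""
    vocab = {tuple(w) + ("</w>",): f for w, f in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        vocab = {merge_pair(sym, best): f for sym, f in vocab.items()}
    return merges

def segment(word, merges):
    """Apply learned merges in order to segment a new word."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)

merges = learn_bpe({"low": 5, "lower": 2, "lowest": 1}, 3)
```

Morphology-aware methods differ mainly in how the merge inventory is chosen; the frequency-driven loop above is the baseline they are compared against.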
A Simple and General Strategy for Referential Problem in Low-Resource Neural Machine Translation
This paper aims to solve a series of referential problems in sequence decoding caused by data sparsity and corpus scarcity in low-resource neural machine translation (NMT), including missing pronouns, reference errors, and bias.
Dynamic Curriculum Learning for Low-Resource Neural Machine Translation
Dynamic curriculum learning eases training by highlighting easy samples that the current model is competent enough to learn.
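One way to realize this idea is a competence-based curriculum: a competence score grows during training and gates which fraction of the (difficulty-ranked) data is visible. The schedule, initial competence `c0`, and length-as-difficulty proxy below are illustrative assumptions, not necessarily this paper's exact formulation.

```python
def competence(step, total_steps, c0=0.1):
    """Model competence grows from c0 to 1 over training
    (square-root schedule, an assumed choice)."""
    return min(1.0, (c0 ** 2 + (1 - c0 ** 2) * step / total_steps) ** 0.5)

def sample_pool(examples, step, total_steps):
    """Expose only the easiest fraction of examples that the current
    competence allows, using sentence length as a difficulty proxy."""
    ranked = sorted(examples, key=len)
    cutoff = max(1, int(competence(step, total_steps) * len(ranked)))
    return ranked[:cutoff]

examples = ["a b", "a", "a b c d", "a b c"]
early = sample_pool(examples, 0, 100)    # only the easiest example
late = sample_pool(examples, 100, 100)   # the full training set
```

Batches are then drawn uniformly from the visible pool, so hard samples enter only once the model can plausibly learn from them.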
A Hybrid Approach for Improved Low Resource Neural Machine Translation using Monolingual Data
Many language pairs are low-resource, meaning the amount and/or quality of available parallel data is insufficient to train a neural machine translation (NMT) model that reaches an acceptable standard of accuracy.