Machine Translation
2148 papers with code • 81 benchmarks • 76 datasets
Machine translation is the task of translating a sentence in a source language to a different target language.
Approaches for machine translation can range from rule-based to statistical to neural-based. More recently, encoder-decoder attention-based architectures like BERT have attained major improvements in machine translation.
One of the most popular datasets used to benchmark machine translation systems is the WMT family of datasets. Some of the most commonly used evaluation metrics for machine translation systems include BLEU, METEOR, NIST, and others.
( Image credit: Google seq2seq )
Libraries
Use these libraries to find Machine Translation models and implementationsSubtasks
Latest papers with no code
Simultaneous Interpretation Corpus Construction by Large Language Models in Distant Language Pair
In Simultaneous Machine Translation (SiMT) systems, training with a simultaneous interpretation (SI) corpus is an effective method for achieving high-quality yet low-latency systems.
Enhancing Length Extrapolation in Sequential Models with Pointer-Augmented Neural Memory
We propose Pointer-Augmented Neural Memory (PANM) to help neural networks understand and apply symbol processing to new, longer sequences of data.
Neuron Specialization: Leveraging intrinsic task modularity for multilingual machine translation
Training a unified multilingual model promotes knowledge transfer but inevitably introduces negative interference.
GeMQuAD : Generating Multilingual Question Answering Datasets from Large Language Models using Few Shot Learning
The emergence of Large Language Models (LLMs) with capabilities like In-Context Learning (ICL) has ushered in new possibilities for data generation across various domains while minimizing the need for extensive data collection and modeling techniques.
Multilingual Evaluation of Semantic Textual Relatedness
The explosive growth of online content demands robust Natural Language Processing (NLP) techniques that can capture nuanced meanings and cultural context across diverse languages.
Extending Translate-Train for ColBERT-X to African Language CLIR
This paper describes the submission runs from the HLTCOE team at the CIRAL CLIR tasks for African languages at FIRE 2023.
Introducing L2M3, A Multilingual Medical Large Language Model to Advance Health Equity in Low-Resource Regions
Addressing the imminent shortfall of 10 million health workers by 2030, predominantly in Low- and Middle-Income Countries (LMICs), this paper introduces an innovative approach that harnesses the power of Large Language Models (LLMs) integrated with machine translation models.
Charles Translator: A Machine Translation System between Ukrainian and Czech
We present Charles Translator, a machine translation system between Ukrainian and Czech, developed as part of a society-wide effort to mitigate the impact of the Russian-Ukrainian war on individuals and society.
An inclusive review on deep learning techniques and their scope in handwriting recognition
This paper presents a survey on the existing studies of deep learning in handwriting recognition field.
Interplay of Machine Translation, Diacritics, and Diacritization
We examine these two questions in both high-resource (HR) and low-resource (LR) settings across 55 different languages (36 African languages and 19 European languages).