Machine Translation

2148 papers with code • 80 benchmarks • 76 datasets

Machine translation is the task of translating text from a source language into a different target language.

Approaches to machine translation range from rule-based to statistical to neural. More recently, encoder-decoder architectures with attention, exemplified by the Transformer, have driven major improvements in machine translation.

The WMT family of datasets is among the most popular benchmarks for machine translation systems. Commonly used evaluation metrics include BLEU, METEOR, and NIST.
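To make the metric concrete, here is a minimal sentence-level BLEU sketch: clipped n-gram precisions combined with a brevity penalty. Real evaluations use corpus-level BLEU with smoothing and standardized tokenization (e.g. the sacreBLEU tool); this stripped-down version is only for illustration.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(hypothesis, reference, max_n=4):
    """Sentence-level BLEU: geometric mean of clipped n-gram
    precisions (n = 1..max_n) times a brevity penalty.
    Unsmoothed -- a single missing n-gram order zeroes the score."""
    hyp, ref = hypothesis.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each hypothesis n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        if clipped == 0:
            return 0.0
        log_prec += math.log(clipped / total)
    # Brevity penalty: punish hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(log_prec / max_n)
```

A perfect match scores 1.0; a hypothesis sharing no words with the reference scores 0.0.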

(Image credit: Google seq2seq)


Bridging the Gap between Different Vocabularies for LLM Ensemble

xydaytoy/eva 15 Apr 2024

Ensembling different large language models (LLMs) to unleash their complementary potential and harness their individual strengths is highly valuable.
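As background, ensembling is straightforward when models happen to share a vocabulary: their next-token distributions can simply be averaged. The sketch below shows that shared-vocabulary baseline; the function name and weighting scheme are illustrative assumptions, and the mismatched-vocabulary case is precisely what this paper addresses.

```python
def ensemble_next_token(dists, weights=None):
    """Combine per-model next-token distributions over a SHARED vocabulary.

    dists: list of dicts mapping token -> probability, one per model.
    weights: optional per-model weights (default: uniform).
    Returns the token with the highest ensembled probability.
    """
    if weights is None:
        weights = [1.0 / len(dists)] * len(dists)
    combined = {}
    for dist, w in zip(dists, weights):
        for token, p in dist.items():
            combined[token] = combined.get(token, 0.0) + w * p
    # Greedy decoding step: pick the argmax of the averaged distribution.
    return max(combined, key=combined.get)
```

With differing tokenizers, the distributions live over different symbol sets, so this direct averaging is not possible; the vocabularies must first be bridged.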

Investigating Neural Machine Translation for Low-Resource Languages: Using Bavarian as a Case Study

faceonlive/ai-research 12 Apr 2024

Machine Translation has made impressive progress in recent years, offering close to human-level performance on many languages, but studies have primarily focused on high-resource languages with broad online presence and resources.

Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations

faceonlive/ai-research 11 Apr 2024

Machine Translation (MT) remains one of the last NLP tasks where large language models (LLMs) have not yet replaced dedicated supervised systems.

Curated Datasets and Neural Models for Machine Translation of Informal Registers between Mayan and Spanish Vernaculars

faceonlive/ai-research 11 Apr 2024

The Mayan languages comprise a language family with an ancient history, millions of speakers, and immense cultural value, that, nevertheless, remains severely underrepresented in terms of resources and global exposure.

Accelerating Inference in Large Language Models with a Unified Layer Skipping Strategy

faceonlive/ai-research 10 Apr 2024

Recently, dynamic computation methods have shown notable acceleration for Large Language Models (LLMs) by skipping several layers of computations through elaborate heuristics or additional predictors.

Control-DAG: Constrained Decoding for Non-Autoregressive Directed Acyclic T5 using Weighted Finite State Automata

faceonlive/ai-research 10 Apr 2024

The Directed Acyclic Transformer is a fast non-autoregressive (NAR) model that performs well in Neural Machine Translation.

SLPL SHROOM at SemEval2024 Task 06: A comprehensive study on models ability to detect hallucination

faceonlive/ai-research 7 Apr 2024

Language models, particularly generative models, are susceptible to hallucinations, generating outputs that contradict factual knowledge or the source text.

F-MALLOC: Feed-forward Memory Allocation for Continual Learning in Neural Machine Translation

wjmacro/continualmt 7 Apr 2024

In the evolving landscape of Neural Machine Translation (NMT), the pretrain-then-finetune paradigm has yielded impressive results.

Low-Resource Machine Translation through Retrieval-Augmented LLM Prompting: A Study on the Mambai Language

raphaelmerx/mambai 7 Apr 2024

Leveraging a novel corpus derived from a Mambai language manual and additional sentences translated by a native speaker, we examine the efficacy of few-shot LLM prompting for machine translation (MT) in this low-resource context.
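The general recipe here is to retrieve sentence pairs similar to the input and place them in a few-shot prompt. A rough sketch of that recipe follows; the word-overlap retriever, prompt wording, and language defaults are illustrative assumptions, not the paper's exact setup.

```python
def retrieve(query, corpus, k=2):
    """Rank (source, target) pairs by word overlap with the query source.
    A crude stand-in for the retrieval step (real systems may use
    embedding similarity, TF-IDF, etc.)."""
    q = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda pair: len(q & set(pair[0].lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query, corpus, src="Mambai", tgt="English"):
    """Assemble a few-shot translation prompt from retrieved pairs."""
    lines = [f"Translate from {src} to {tgt}."]
    for s, t in retrieve(query, corpus):
        lines.append(f"{src}: {s}\n{tgt}: {t}")
    lines.append(f"{src}: {query}\n{tgt}:")
    return "\n\n".join(lines)
```

The resulting string is sent to the LLM, which continues the final line with its translation.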

KazQAD: Kazakh Open-Domain Question Answering Dataset

is2ai/kazqad 6 Apr 2024

We introduce KazQAD -- a Kazakh open-domain question answering (ODQA) dataset -- that can be used in both reading comprehension and full ODQA settings, as well as for information retrieval experiments.
