Machine Translation

2154 papers with code • 80 benchmarks • 77 datasets

Machine translation is the task of translating a sentence in a source language to a different target language.

Approaches to machine translation range from rule-based to statistical to neural. More recently, attention-based encoder-decoder architectures such as the Transformer have brought major improvements in machine translation.

The WMT family of datasets is among the most popular benchmarks for machine translation systems. Commonly used evaluation metrics include BLEU, METEOR, and NIST.
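
As a rough illustration of how an n-gram overlap metric such as BLEU scores a translation, the sketch below computes a simplified sentence-level BLEU against a single reference. It assumes whitespace tokenization and no smoothing, and the function names `ngrams` and `sentence_bleu` are hypothetical helpers chosen for this example. It is not the official WMT scoring procedure, which works at corpus level with standardized tokenization.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Simplified single-reference, unsmoothed sentence-level BLEU in [0, 1]."""
    cand = candidate.split()  # assumption: plain whitespace tokenization
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Modified precision: clip each candidate n-gram count by its
        # count in the reference so repeated words are not over-rewarded.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # unsmoothed: any zero precision gives a score of 0
        log_precisions.append(math.log(overlap / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(log_precisions) / max_n)
```

Under these assumptions, an exact match of at least four tokens scores 1.0, a candidate sharing no words with the reference scores 0.0, and a correct but truncated candidate is reduced only by the brevity penalty.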

Control-DAG: Constrained Decoding for Non-Autoregressive Directed Acyclic T5 using Weighted Finite State Automata

faceonlive/ai-research 10 Apr 2024

The Directed Acyclic Transformer is a fast non-autoregressive (NAR) model that performs well in Neural Machine Translation.

186

SLPL SHROOM at SemEval2024 Task 06: A comprehensive study on models ability to detect hallucination

faceonlive/ai-research 7 Apr 2024

Language models, particularly generative models, are susceptible to hallucinations, generating outputs that contradict factual knowledge or the source text.

186

F-MALLOC: Feed-forward Memory Allocation for Continual Learning in Neural Machine Translation

wjmacro/continualmt 7 Apr 2024

In the evolving landscape of Neural Machine Translation (NMT), the pretrain-then-finetune paradigm has yielded impressive results.

2

Low-Resource Machine Translation through Retrieval-Augmented LLM Prompting: A Study on the Mambai Language

raphaelmerx/mambai 7 Apr 2024

Leveraging a novel corpus derived from a Mambai language manual and additional sentences translated by a native speaker, we examine the efficacy of few-shot LLM prompting for machine translation (MT) in this low-resource context.

2

KazQAD: Kazakh Open-Domain Question Answering Dataset

is2ai/kazqad 6 Apr 2024

We introduce KazQAD -- a Kazakh open-domain question answering (ODQA) dataset -- that can be used in both reading comprehension and full ODQA settings, as well as for information retrieval experiments.

1

Large Language Models for Expansion of Spoken Language Understanding Systems to New Languages

samsung/mt-llm-nlu 3 Apr 2024

In the on-device scenario (tiny and not pretrained SLU), our method improved the Overall Accuracy from 5.31% to 22.06% over the baseline Global-Local Contrastive Learning Framework (GL-CLeF) method.

4

Low-resource neural machine translation with morphological modeling

anzeyimana/kinmt_naacl2024 3 Apr 2024

An attention augmentation scheme to the transformer model is proposed in a generic form to allow integration of pre-trained language models and also facilitate modeling of word order relationships between the source and target languages.

0

An image speaks a thousand words, but can everyone listen? On translating images for cultural relevance

simran-khanuja/image-transcreation 1 Apr 2024

First, we build three pipelines comprising state-of-the-art generative models to do the task.

10

AAdaM at SemEval-2024 Task 1: Augmentation and Adaptation for Multilingual Semantic Textual Relatedness

uds-lsv/aadam 1 Apr 2024

This paper presents our system developed for the SemEval-2024 Task 1: Semantic Textual Relatedness for African and Asian Languages.

2

KazParC: Kazakh Parallel Corpus for Machine Translation

is2ai/kazparc 28 Mar 2024

We introduce KazParC, a parallel corpus designed for machine translation across Kazakh, English, Russian, and Turkish.

1