3 dataset results for Machine Translation AND Texts AND Tamil

WMT 2014

WMT 2014 is a collection of datasets used in shared tasks of the Ninth Workshop on Statistical Machine Translation. The workshop featured four tasks:

274 PAPERS • 11 BENCHMARKS

Samanantar

Samanantar is the largest publicly available collection of parallel corpora for Indic languages: Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, and Telugu. The corpus has 49.6M sentence pairs between English and Indian languages.

37 PAPERS • NO BENCHMARKS YET

WMT 2020

WMT 2020 is a collection of datasets used in shared tasks of the Fifth Conference on Machine Translation. The conference builds on a series of annual workshops and conferences on Statistical Machine Translation.

33 PAPERS • 1 BENCHMARK
