FLoRes (Facebook Low Resource MT Benchmark)

Introduced by Guzmán et al. in The FLORES Evaluation Datasets for Low-Resource Machine Translation: Nepali–English and Sinhala–English

FLoRes is a benchmark dataset for machine translation between English and four low-resource languages (Nepali, Sinhala, Khmer, and Pashto), built from translated Wikipedia sentences. The FLoRes project was later extended in two larger releases: FLoRes-101 and FLoRes-200.

  • FLoRes-101: This was the first of the two releases. It covers 101 languages, so translation quality can be measured across 10,100 translation directions (101 source languages × 100 target languages each).

  • FLoRes-200: This is an updated version of the dataset that roughly doubles the language coverage of FLoRes-101. Because many of the added languages are less standardized and require more specialized professional translation, the verification process became more complex.
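
The figure of 10,100 translation directions quoted for FLoRes-101 follows from counting ordered (source, target) language pairs: with n languages there are n × (n − 1) directions. The short sketch below illustrates that arithmetic only. The function name `translation_directions` and the three language codes are illustrative assumptions, not part of any FLoRes tooling.

```python
from itertools import permutations


def translation_directions(num_languages: int) -> int:
    """Count ordered (source, target) pairs where source != target."""
    if num_languages < 0:
        raise ValueError("num_languages must be non-negative")
    return num_languages * (num_languages - 1)


# Cross-check the closed form against explicit enumeration on a small case.
langs = ["en", "ne", "si"]  # illustrative language codes (assumption)
pairs = list(permutations(langs, 2))  # every ordered pair of distinct languages
assert len(pairs) == translation_directions(len(langs))

# 101 languages give the figure quoted for FLoRes-101.
print(translation_directions(101))  # 10100
```

The same formula can be applied to FLoRes-200 once its exact language count is known. The text above only says coverage roughly doubles, so no specific figure is assumed here.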
