Belebele

Introduced by Bandarkar et al. in The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants

Belebele is a multiple-choice machine reading comprehension (MRC) dataset spanning 122 language variants. It enables the evaluation of monolingual and multilingual models on high-, medium-, and low-resource languages. Each question has four multiple-choice answers and is linked to a short passage from the FLORES-200 dataset. The human annotation procedure was carefully designed to produce questions that discriminate between different levels of generalizable language comprehension, and it is reinforced by extensive quality checks. Although every question relates directly to its passage, the English dataset on its own proves difficult enough to challenge state-of-the-art language models. Because the dataset is fully parallel, model performance can be compared directly across all languages. Belebele opens up new avenues for evaluating and analyzing the multilingual abilities of language models and NLP systems.
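
The structure described above (a passage, a question, four answer options, and the same questions in every language) lends itself to a per-language accuracy comparison. The following is a minimal sketch of that idea, not the dataset's official loader or evaluation code: the `Item` record, its field names, the `always_first` baseline, and the toy items are all hypothetical and invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One multiple-choice question. Field names are hypothetical,
    not the dataset's actual schema."""
    language: str                        # language-variant tag (placeholder)
    passage: str                         # short passage the question refers to
    question: str
    answers: tuple[str, str, str, str]   # exactly four answer options
    correct: int                         # index (0-3) of the right option


def accuracy_by_language(items, predict):
    """Return {language: accuracy} for a predict(item) -> int function."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for item in items:
        totals[item.language] += 1
        if predict(item) == item.correct:
            hits[item.language] += 1
    return {lang: hits[lang] / totals[lang] for lang in totals}


def always_first(item):
    """Trivial baseline: always choose the first option."""
    return 0


# Toy, invented items (NOT taken from Belebele): the same two questions
# in two languages, mimicking the fully parallel layout.
toy_items = [
    Item("eng_Latn", "The cat sat on the mat.", "Where did the cat sit?",
         ("On the mat", "On the bed", "On the roof", "In the box"), 0),
    Item("eng_Latn", "Rain fell all night.", "When did the rain fall?",
         ("At noon", "All night", "At dawn", "Never"), 1),
    Item("fra_Latn", "Le chat était assis sur le tapis.",
         "Où le chat était-il assis ?",
         ("Sur le tapis", "Sur le lit", "Sur le toit", "Dans la boîte"), 0),
    Item("fra_Latn", "Il a plu toute la nuit.", "Quand a-t-il plu ?",
         ("À midi", "Toute la nuit", "À l'aube", "Jamais"), 1),
]

scores = accuracy_by_language(toy_items, always_first)
for lang, acc in sorted(scores.items()):
    print(f"{lang}: {acc:.2f}")
```

With four options per question, uniform random guessing scores 25% in expectation, so per-language accuracies are best read against that floor. Because the questions are parallel, a gap between two languages in such a table reflects the model rather than a difference in question difficulty.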

Tasks

Natural Language Understanding

Machine Reading Comprehension

Reading Comprehension (Zero-Shot)

Reading Comprehension (One-Shot)

Reading Comprehension (Few-Shot)

Natural Questions

Multilingual Machine Comprehension in English Hindi

Vietnamese Machine Reading Comprehension

Multiple-choice

Multilingual text classification

Multilingual NLP

Pretrained Multilingual Language Models

Similar Datasets

GPQA

XStoryCloze

XCOPA

AfriQA

License

CC BY-SA

Languages

Mandarin Chinese

Standard Arabic

Mesopotamian Arabic

North Levantine Arabic

Moroccan Arabic

Egyptian Arabic

South Azerbaijani

North Azerbaijani

Central Kurdish

Nigerian Fulfulde

West Central Oromo

Luo (Kenya and Tanzania)

Standard Latvian

Malay (macrolanguage)

Nepali (macrolanguage)

Norwegian Bokmål

Nepali (individual language)

Oriya (macrolanguage)

Southern Pashto

Iranian Persian

Plateau Malagasy

Waray (Philippines)

Malay (individual language)