8 dataset results for Reading Comprehension AND Russian

XQuAD

XQuAD (Cross-lingual Question Answering Dataset) is a benchmark dataset for evaluating cross-lingual question answering performance. The dataset consists of a subset of 240 paragraphs and 1190 question-answer pairs from the development set of SQuAD v1.1 (Rajpurkar et al., 2016) together with their professional translations into ten languages: Spanish, German, Greek, Russian, Turkish, Arabic, Vietnamese, Thai, Chinese, and Hindi. Consequently, the dataset is entirely parallel across 11 languages (English plus the ten translations).
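Because every question exists in each language, a per-language score can be computed over the same question ids. The following is a minimal sketch of SQuAD-style exact-match scoring, not the official evaluation script. The function names and the nested-dictionary layout (`{lang: {question_id: ...}}`) are assumptions made for illustration, and the article stripping only covers English articles.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, drop ASCII punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: list[str]) -> bool:
    """True if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)


def exact_match_by_language(
    predictions: dict[str, dict[str, str]],
    references: dict[str, dict[str, list[str]]],
) -> dict[str, float]:
    """Hypothetical layout: predictions[lang][qid] -> str, references[lang][qid] -> list of str."""
    scores = {}
    for lang, refs in references.items():
        lang_preds = predictions.get(lang, {})
        # A missing prediction is scored as an empty string, i.e. a miss.
        hits = sum(exact_match(lang_preds.get(qid, ""), golds) for qid, golds in refs.items())
        scores[lang] = hits / len(refs) if refs else 0.0
    return scores
```

Since the question ids are shared across languages, the resulting scores can be compared directly between, for example, English and Russian.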

171 PAPERS • 1 BENCHMARK

Belebele

Belebele is a multiple-choice machine reading comprehension (MRC) dataset spanning 122 language variants. This dataset enables the evaluation of mono- and multi-lingual models in high-, medium-, and low-resource languages. Each question has four multiple-choice answers and is linked to a short passage from the FLORES-200 dataset. The human annotation procedure was carefully curated to create questions that discriminate between different levels of generalizable language comprehension and is reinforced by extensive quality checks. While all questions directly relate to the passage, the English dataset on its own proves difficult enough to challenge state-of-the-art language models. Being fully parallel, this dataset enables direct comparison of model performance across all languages. Belebele opens up new avenues for evaluating and analyzing the multilingual abilities of language models and NLP systems.
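As an illustration of the task format described above (a passage, a question, and four candidate answers), here is a minimal sketch of accuracy scoring for multiple-choice items. The class name, field names, and the `pick_longest` baseline are hypothetical and do not reflect the dataset's actual schema.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class MultipleChoiceItem:
    # Hypothetical field names, chosen for illustration only.
    passage: str
    question: str
    answers: tuple[str, str, str, str]
    correct_index: int  # 0-based position of the correct answer


def accuracy(
    items: Sequence[MultipleChoiceItem],
    choose: Callable[[MultipleChoiceItem], int],
) -> float:
    """Fraction of items for which `choose` returns the correct answer index."""
    if not items:
        return 0.0
    correct = sum(choose(item) == item.correct_index for item in items)
    return correct / len(items)


def pick_longest(item: MultipleChoiceItem) -> int:
    """Trivial baseline: always select the longest candidate answer."""
    return max(range(len(item.answers)), key=lambda i: len(item.answers[i]))
```

With four options per question, uniform random guessing yields an expected accuracy of 0.25, which is a useful floor when comparing models across languages.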

20 PAPERS • NO BENCHMARKS YET

RUSSE

RUSSE (Russian Words in Context (based on RUSSE))

WiC: The Word-in-Context Dataset. A reliable benchmark for the evaluation of context-sensitive word embeddings.

7 PAPERS • 1 BENCHMARK

SberQuAD (Sberbank Question Answering Dataset)

SberQuAD, a large-scale analogue of Stanford SQuAD in the Russian language, is a valuable resource that has not been properly presented to the scientific community.

7 PAPERS • 1 BENCHMARK

MuSeRC

MuSeRC (Russian Multi-Sentence Reading Comprehension)

We present a reading comprehension challenge in which questions can only be answered by taking into account information from multiple sentences. The dataset is the first to study multi-sentence inference at scale, with an open-ended set of question types that requires reasoning skills.

6 PAPERS • 1 BENCHMARK

XQA

XQA is a dataset of 90k question-answer pairs in nine languages for cross-lingual open-domain question answering.

6 PAPERS • NO BENCHMARKS YET

Taiga Corpus (An open-source corpus for machine learning.)

Taiga is a corpus in which text sources and their meta-information are collected according to popular ML tasks.

5 PAPERS • NO BENCHMARKS YET

NEREL-BIO

NEREL-BIO is an annotation scheme and corpus of PubMed abstracts in Russian and English. It contains annotations for 700+ Russian and 100+ English abstracts. All English PubMed annotations have corresponding Russian counterparts. NEREL-BIO annotates nested named entities and can be used as a benchmark for cross-domain (NEREL -> NEREL-BIO) and cross-language (English -> Russian) transfer.

1 PAPER • NO BENCHMARKS YET
