Reading Comprehension

568 papers with code • 7 benchmarks • 95 datasets

Most current question answering datasets frame the task as reading comprehension, where the question is about a paragraph or document and the answer is often a span in the document.

Some specific tasks of reading comprehension include multi-modal machine reading comprehension and textual machine reading comprehension, among others. In the literature, machine reading comprehension can be divided into four categories: cloze style, multiple choice, span prediction, and free-form answer. Read more about each category here.
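The four categories named above differ mainly in the shape of the expected answer. The following is a minimal sketch of that difference, not a format used by any particular dataset: the class names, field names, the `@placeholder` token, and the example passage are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClozeExample:
    passage: str
    query: str   # statement with a placeholder token to fill in (hypothetical token)
    answer: str


@dataclass
class MultipleChoiceExample:
    passage: str
    question: str
    options: List[str]
    answer_index: int  # index into `options`


@dataclass
class SpanExample:
    passage: str
    question: str
    start: int  # character offset into the passage, inclusive
    end: int    # character offset into the passage, exclusive

    def answer(self) -> str:
        # The answer is read directly out of the passage.
        return self.passage[self.start:self.end]


@dataclass
class FreeFormExample:
    passage: str
    question: str
    answer: str  # generated text; need not appear verbatim in the passage


# Illustrative passage, not taken from any benchmark.
PASSAGE = "Marie Curie won the Nobel Prize in Physics in 1903."

cloze = ClozeExample(
    PASSAGE,
    "@placeholder won the Nobel Prize in Physics in 1903.",
    "Marie Curie",
)

multiple_choice = MultipleChoiceExample(
    PASSAGE,
    "In which field did Marie Curie win the Nobel Prize in 1903?",
    ["Chemistry", "Physics", "Medicine", "Literature"],
    1,
)

year_start = PASSAGE.index("1903")
span = SpanExample(
    PASSAGE,
    "When did Marie Curie win the Nobel Prize in Physics?",
    year_start,
    year_start + len("1903"),
)

free_form = FreeFormExample(
    PASSAGE,
    "Roughly when did Marie Curie receive the prize?",
    "She received it in the early twentieth century.",
)
```

In this sketch only the span example stores offsets instead of answer text, which reflects the idea that a span answer is a substring of the document, while a free-form answer may be worded independently of it.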

Benchmark datasets used for testing a model's reading comprehension abilities include MovieQA, ReCoRD, and RACE, among others.

The Machine Reading group at UCL also provides an overview of reading comprehension tasks.

Figure source: A Survey on Machine Reading Comprehension: Tasks, Evaluation Metrics and Benchmark Datasets


ViTextVQA: A Large-Scale Visual Question Answering Dataset for Evaluating Vietnamese Text Comprehension in Images

minhquan6203/vitextvqa-dataset 16 Apr 2024

Visual Question Answering (VQA) is a complicated task that requires the capability of simultaneously processing natural language and images.

5

NoticIA: A Clickbait Article Summarization Dataset in Spanish

faceonlive/ai-research 11 Apr 2024

We present NoticIA, a dataset consisting of 850 Spanish news articles featuring prominent clickbait headlines, each paired with high-quality, single-sentence generative summarizations written by humans.

144

Interpreting Themes from Educational Stories

faceonlive/ai-research 8 Apr 2024

Reading comprehension continues to be a crucial research focus in the NLP community.

144

KazQAD: Kazakh Open-Domain Question Answering Dataset

is2ai/kazqad 6 Apr 2024

We introduce KazQAD -- a Kazakh open-domain question answering (ODQA) dataset -- that can be used in both reading comprehension and full ODQA settings, as well as for information retrieval experiments.

1

Sailor: Open Language Models for South-East Asia

sail-sg/sailor-llm 4 Apr 2024

We present Sailor, a family of open language models ranging from 0.5B to 7B parameters, tailored for South-East Asian (SEA) languages.

69

ST-LLM: Large Language Models Are Effective Temporal Learners

TencentARC/ST-LLM 30 Mar 2024

In this paper, we investigate a straightforward yet unexplored question: Can we feed all spatial-temporal tokens into the LLM, thus delegating the task of video sequence modeling to the LLMs?

36

Latxa: An Open Language Model and Evaluation Suite for Basque

hitz-zentroa/latxa 29 Mar 2024

We introduce Latxa, a family of large language models for Basque ranging from 7 to 70 billion parameters.

11

ArabicaQA: A Comprehensive Dataset for Arabic Question Answering

datascienceuibk/arabicaqa 26 Mar 2024

In conclusion, ArabicaQA, AraDPR, and the benchmarking of LLMs in Arabic question answering offer significant advancements in the field of Arabic NLP.

9

ChroniclingAmericaQA: A Large-scale Question Answering Dataset based on Historical American Newspaper Pages

datascienceuibk/chroniclingamericaqa 26 Mar 2024

Therefore, to enable realistic testing of QA models, our dataset can be used in three different ways: answering questions from raw and noisy content, answering questions from a cleaner, corrected version of the content, and answering questions from scanned images of newspaper pages.

4

WangchanLion and WangchanX MRC Eval

vistec-ai/wangchanlion 24 Mar 2024

Our model is based on SEA-LION and a collection of instruction following datasets.

2