Reading Comprehension

568 papers with code • 7 benchmarks • 95 datasets

Most current question answering datasets frame the task as reading comprehension, where the question is about a paragraph or document and the answer is often a span in the document.
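As a concrete (hypothetical) illustration of the span-prediction setup, the sketch below uses the Hugging Face transformers question-answering pipeline to extract an answer span from a short passage; the passage, question, and default model are placeholders rather than part of any benchmark.

```python
# Minimal sketch of extractive (span-prediction) reading comprehension using the
# Hugging Face transformers question-answering pipeline; example text is invented.
from transformers import pipeline

qa = pipeline("question-answering")  # loads a default extractive QA model

context = (
    "Reading comprehension datasets pair a question with a paragraph, "
    "and the answer is usually a span of that paragraph."
)
question = "What is the answer usually a span of?"

result = qa(question=question, context=context)
# The pipeline returns the predicted span text with character offsets and a score,
# e.g. {'answer': ..., 'start': ..., 'end': ..., 'score': ...}
print(result["answer"], result["start"], result["end"])
```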

Some specific tasks of reading comprehension include multi-modal machine reading comprehension and textual machine reading comprehension, among others. In the literature, machine reading comprehension can be divided into four categories: cloze style, multiple choice, span prediction, and free-form answer.
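The toy records below sketch what an instance of each category might look like; all field names and examples are invented for illustration and do not follow any particular dataset schema.

```python
# Hypothetical instances of the four MRC categories; field names are illustrative only.
cloze = {
    "context": "The capital of France is Paris.",
    "query": "The capital of France is ___.",
    "answer": "Paris",
}
multiple_choice = {
    "context": "Ann gave Bob an apple because he was hungry.",
    "question": "Why did Ann give Bob an apple?",
    "options": ["He was hungry.", "It was his birthday.", "She disliked apples."],
    "answer_index": 0,
}
span_prediction = {
    "context": "Marie Curie was born in Warsaw in 1867.",
    "question": "Where was Marie Curie born?",
    "answer_span": (24, 30),  # character offsets of "Warsaw" in the context
}
free_form = {
    "context": "The experiment failed because the sample was contaminated.",
    "question": "Summarize why the experiment failed.",
    "answer": "Contamination of the sample caused the failure.",
}
```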

Benchmark datasets used for testing a model's reading comprehension abilities include MovieQA, ReCoRD, and RACE, among others.
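Several of these benchmarks are mirrored on the Hugging Face hub; assuming the hub copy named "race", a RACE split can be loaded roughly as follows, and other benchmarks follow a similar pattern.

```python
# Sketch of loading the RACE benchmark via the Hugging Face datasets library;
# the dataset name and config assume the hub copy "race" with "high"/"middle"/"all" subsets.
from datasets import load_dataset

race = load_dataset("race", "high")
example = race["train"][0]

# Each example pairs a passage with a multiple-choice question.
print(example["article"][:200])  # the passage
print(example["question"])       # the question stem
print(example["options"])        # candidate answers
print(example["answer"])         # gold label, e.g. "A"
```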

The Machine Reading group at UCL also provides an overview of reading comprehension tasks.

Figure source: A Survey on Machine Reading Comprehension: Tasks, Evaluation Metrics and Benchmark Datasets

Libraries

Use these libraries to find Reading Comprehension models and implementations

Latest papers with no code

PDF-MVQA: A Dataset for Multimodal Information Retrieval in PDF-based Visual Question Answering

no code yet • 19 Apr 2024

Document Question Answering (QA) presents a challenge in understanding visually-rich documents (VRD), particularly those dominated by lengthy textual content like research journal articles.

emrQA-msquad: A Medical Dataset Structured with the SQuAD V2.0 Framework, Enriched with emrQA Medical Information

no code yet • 18 Apr 2024

Machine Reading Comprehension (MRC) holds a pivotal role in shaping Medical Question Answering Systems (QAS) and transforming the landscape of accessing and applying medical information.
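For reference, a record in the SQuAD v2.0 layout that the dataset adopts looks roughly like the toy sketch below; the clinical content is invented and not taken from emrQA-msquad itself.

```python
# Toy record in the SQuAD v2.0 layout; the clinical content is invented.
squad_v2_record = {
    "title": "Example clinical note",
    "paragraphs": [{
        "context": "The patient was prescribed metformin for type 2 diabetes.",
        "qas": [
            {
                "id": "q1",
                "question": "What medication was prescribed?",
                "answers": [{"text": "metformin", "answer_start": 27}],
                "is_impossible": False,
            },
            {
                "id": "q2",
                "question": "What was the patient's blood pressure?",
                "answers": [],          # SQuAD v2.0 allows unanswerable questions
                "is_impossible": True,
            },
        ],
    }],
}
```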

Question Difficulty Ranking for Multiple-Choice Reading Comprehension

no code yet • 16 Apr 2024

Additionally, zero-shot comparative assessment is more effective at question difficulty ranking than absolute assessment and even the task-transfer approaches, achieving a Spearman's correlation of 40.4%.
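As a rough illustration of the metric involved, Spearman's rank correlation between gold and predicted difficulty orderings can be computed as below; the numbers are invented, not the paper's data.

```python
# Toy Spearman's rank correlation for question difficulty ranking (invented numbers).
from scipy.stats import spearmanr

gold_difficulty = [0.10, 0.35, 0.40, 0.70, 0.90]   # e.g. observed error rates per question
predicted_scores = [0.20, 0.30, 0.50, 0.40, 0.80]  # a model's difficulty scores

rho, p_value = spearmanr(gold_difficulty, predicted_scores)
print(f"Spearman's rho = {rho:.3f} (p = {p_value:.3f})")
```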

Fewer Truncations Improve Language Modeling

no code yet • 16 Apr 2024

In large language model training, input documents are typically concatenated together and then split into sequences of equal length to avoid padding tokens.
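A minimal sketch of that conventional concatenate-then-split preprocessing (whose truncations the paper argues against) is shown below, assuming documents have already been tokenized into lists of token IDs; EOS_ID and the sequence length are placeholders.

```python
# Conventional concatenate-then-chunk preprocessing for LM training data.
# Assumes documents are already tokenized; EOS_ID and SEQ_LEN are placeholders.
from typing import List

EOS_ID = 0    # placeholder end-of-document token id
SEQ_LEN = 8   # tiny for illustration; real runs use e.g. 2048-8192

def pack_documents(docs: List[List[int]], seq_len: int = SEQ_LEN) -> List[List[int]]:
    stream: List[int] = []
    for doc in docs:
        stream.extend(doc + [EOS_ID])  # concatenate every document into one token stream
    # Split the stream into equal-length sequences (ragged tail dropped); documents are
    # truncated wherever a chunk boundary happens to fall.
    return [stream[i:i + seq_len] for i in range(0, len(stream) - seq_len + 1, seq_len)]

chunks = pack_documents([[1, 2, 3, 4, 5], [6, 7, 8], [9, 10, 11, 12]])
print(chunks)  # [[1, 2, 3, 4, 5, 0, 6, 7]] -- the second document is cut at the boundary
```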

Automatic Generation and Evaluation of Reading Comprehension Test Items with Large Language Models

no code yet • 11 Apr 2024

We then used this protocol and the dataset to evaluate the quality of items generated by Llama 2 and GPT-4.

CausalBench: A Comprehensive Benchmark for Causal Learning Capability of Large Language Models

no code yet • 9 Apr 2024

To address these challenges, this paper proposes a comprehensive benchmark, namely CausalBench, to evaluate the causality understanding capabilities of LLMs.

LLMs' Reading Comprehension Is Affected by Parametric Knowledge and Struggles with Hypothetical Statements

no code yet • 9 Apr 2024

In particular, while some models prove virtually unaffected by knowledge conflicts in affirmative and negative contexts, when faced with more semantically involved modal and conditional environments, they often fail to separate the text from their internal knowledge.

XL$^2$Bench: A Benchmark for Extremely Long Context Understanding with Long-range Dependencies

no code yet • 8 Apr 2024

However, prior benchmarks create datasets that ostensibly cater to long-text comprehension by expanding the input of traditional tasks, which falls short of exhibiting the unique characteristics of long-text understanding, including long-dependency tasks and text lengths compatible with modern LLMs' context window sizes.

The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models

no code yet • 8 Apr 2024

Large Language Models (LLMs) have transformed the Natural Language Processing (NLP) landscape with their remarkable ability to understand and generate human-like text.

Explaining EDA synthesis errors with LLMs

no code yet • 7 Apr 2024

Training new engineers in digital design is a challenge, particularly when it comes to teaching the complex electronic design automation (EDA) tooling used in this domain.