NExT-QA is a VideoQA benchmark targeting the explanation of video content. It challenges QA models to reason about causal and temporal actions and to understand the rich object interactions in daily activities. It supports both multiple-choice and open-ended QA tasks. The videos are untrimmed, and the questions usually invoke local video content for answers.
80 PAPERS • 5 BENCHMARKS
This dataset aims to study existing reading comprehension models' capability to perform temporal reasoning, and to see whether they are sensitive to the temporal descriptions in the given questions.
16 PAPERS • NO BENCHMARKS YET