MP-DocVQA (Multipage Document Visual Question Answering)

Introduced by Tito et al. in Hierarchical multimodal transformers for Multi-Page DocVQA

The dataset targets Visual Question Answering on multipage scanned industry documents. The questions and answers are reused from the Single Page DocVQA (SP-DocVQA) dataset. The images are also the same as in the original dataset, extended with the preceding and following pages of each document, up to a limit of 20 pages per document.
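
To make the page limit concrete, here is a minimal sketch of how a window of at most 20 pages around the original SP-DocVQA page could be selected. The description above does not say how the surrounding pages are actually chosen, so the centring strategy, the function name `page_window`, and its parameters are all assumptions made for illustration.

```python
from typing import List


def page_window(answer_page: int, total_pages: int, max_pages: int = 20) -> List[int]:
    """Return 0-based page indices of a window that contains answer_page.

    Hypothetical helper: the real dataset may pick pages differently.
    """
    if not 0 <= answer_page < total_pages:
        raise ValueError("answer_page must be a valid page index")

    # The window never exceeds the document length or the page limit.
    size = min(max_pages, total_pages)

    # Assumption: try to centre the window on the original page,
    # then shift it so that it stays inside the document.
    start = answer_page - size // 2
    start = max(0, min(start, total_pages - size))

    return list(range(start, start + size))
```

For a 100-page document whose answer is on page index 30, this sketch keeps 20 consecutive pages that include page 30; for a document shorter than 20 pages, it keeps every page.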
