EarthVQA (A multi-modal multi-task VQA dataset for remote sensing)

Introduced by Wang et al. in EarthVQA: Towards Queryable Earth via Relational Reasoning-Based Remote Sensing Visual Question Answering

Earth vision research typically focuses on extracting geospatial object locations and categories but neglects the exploration of relations between objects and comprehensive reasoning. Based on city planning needs, we develop a multi-modal multi-task VQA dataset (EarthVQA) to advance relational reasoning-based judging, counting, and comprehensive analysis. The EarthVQA dataset contains 6000 images, corresponding semantic masks, and 208,593 QA pairs with urban and rural governance requirements embedded.

Characteristics:

Multi-level annotations: Paired image, mask, and QA annotations support relational reasoning-based remote sensing visual question answering.
Applicable QA pairs: All QA pairs are designed around actual city planning needs.
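
As a rough illustration of the image-mask-QA pairing described above, one sample could be represented as follows. This is a minimal sketch, not the official EarthVQA schema: the class names, field names, file paths, task labels, and example questions are all hypothetical. Only the image and QA pair counts come from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class QAPair:
    # Hypothetical fields; the real annotation format may differ.
    question: str
    answer: str
    task: str  # e.g. "judging", "counting", "comprehensive analysis"


@dataclass
class Sample:
    # One image, its semantic mask, and every QA pair attached to it.
    image_path: str
    mask_path: str
    qa_pairs: List[QAPair] = field(default_factory=list)


def count_by_task(samples: List[Sample]) -> Dict[str, int]:
    """Count QA pairs per task type across a list of samples."""
    counts: Dict[str, int] = {}
    for sample in samples:
        for qa in sample.qa_pairs:
            counts[qa.task] = counts.get(qa.task, 0) + 1
    return counts


# Figures stated in the dataset description.
NUM_IMAGES = 6000
NUM_QA_PAIRS = 208_593
AVG_QA_PER_IMAGE = NUM_QA_PAIRS / NUM_IMAGES  # roughly 34.8 QA pairs per image

# Made-up example content, for illustration only.
example = Sample(
    image_path="images/0001.png",
    mask_path="masks/0001.png",
    qa_pairs=[
        QAPair("Is there a road next to the water?", "Yes", "judging"),
        QAPair("How many buildings are in the scene?", "12", "counting"),
    ],
)
task_counts = count_by_task([example])
```

Grouping the QA pairs under a single image and mask keeps the three annotation levels together, which is what a relational reasoning model would need to consume jointly.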

Benchmarks

Trend	Task	Dataset Variant	Best Model	Paper	Code
	Visual Question Answering	EarthVQA	SOBA

Similar Datasets

LoveDA

License

CC BY-NC

Modalities

Images
Texts

Languages

English
