TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Visual Question Answering (VQA)	DVQA test-familiar	PReFIL (Oracle OCR)	1:1 Accuracy	96.37	# 1
Visual Question Answering (VQA)	FigureQA - test 1	PReFIL	1:1 Accuracy	94.88	# 1
Visual Question Answering (VQA)	PlotQA-D1	PReFIL	1:1 Accuracy	57.91	# 2
Visual Question Answering (VQA)	PlotQA-D2	PReFIL	1:1 Accuracy	10.37	# 3

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/answering-questions-about-data-visualizations/visual-question-answering-vqa-on-dvqa-test)](https://paperswithcode.com/sota/visual-question-answering-vqa-on-dvqa-test?p=answering-questions-about-data-visualizations)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/answering-questions-about-data-visualizations/visual-question-answering-on-figureqa-test-1)](https://paperswithcode.com/sota/visual-question-answering-on-figureqa-test-1?p=answering-questions-about-data-visualizations)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/answering-questions-about-data-visualizations/visual-question-answering-on-plotqa-d1)](https://paperswithcode.com/sota/visual-question-answering-on-plotqa-d1?p=answering-questions-about-data-visualizations)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/answering-questions-about-data-visualizations/visual-question-answering-on-plotqa-d2)](https://paperswithcode.com/sota/visual-question-answering-on-plotqa-d2?p=answering-questions-about-data-visualizations)`

Answering Questions about Data Visualizations using Efficient Bimodal Fusion

5 Aug 2019 · Kushal Kafle, Robik Shrestha, Brian Price, Scott Cohen, Christopher Kanan

Chart question answering (CQA) is a newly proposed visual question answering (VQA) task where an algorithm must answer questions about data visualizations, e.g., bar charts, pie charts, and line graphs. CQA requires capabilities that natural-image VQA algorithms lack: fine-grained measurements, optical character recognition, and handling out-of-vocabulary words in both questions and answers. Without modifications, state-of-the-art VQA algorithms perform poorly on this task. Here, we propose a novel CQA algorithm called parallel recurrent fusion of image and language (PReFIL). PReFIL first learns bimodal embeddings by fusing question and image features and then intelligently aggregates these learned embeddings to answer the given question. Despite its simplicity, PReFIL greatly surpasses state-of-the-art systems and human baselines on both the FigureQA and DVQA datasets. Additionally, we demonstrate that PReFIL can be used to reconstruct tables by asking a series of questions about a chart.
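The abstract describes a two-stage idea: fuse question and image features into bimodal embeddings, then aggregate those embeddings. The toy sketch below illustrates only that general pattern and is not the PReFIL implementation. Every name, the toy dimensions, the random weights, the linear-plus-ReLU fusion, and the plain tanh recurrence used for aggregation are assumptions chosen for illustration; the abstract does not specify any of them.

```python
import math
import random


def fuse(image_cells, question_vec, weights, bias):
    """Fuse the question vector with every image cell (hypothetical fusion step)."""
    fused = []
    for cell in image_cells:
        joint = cell + question_vec  # concatenate the two modalities
        out = [
            max(0.0, sum(w * x for w, x in zip(row, joint)) + b)  # linear map + ReLU
            for row, b in zip(weights, bias)
        ]
        fused.append(out)
    return fused


def aggregate(fused, w_in, w_rec):
    """Summarise the fused cells with a plain tanh recurrence (assumed aggregator)."""
    hidden = [0.0] * len(w_rec)
    for vec in fused:
        hidden = [
            math.tanh(
                sum(wi * x for wi, x in zip(row_in, vec))
                + sum(wr * h for wr, h in zip(row_rec, hidden))
            )
            for row_in, row_rec in zip(w_in, w_rec)
        ]
    return hidden


def random_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights (stand-in for learned ones)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
IMG_DIM, Q_DIM, FUSED_DIM, HID_DIM, N_CELLS = 4, 3, 5, 6, 9  # toy sizes, assumed

# Stand-ins for CNN features at 9 spatial locations and an encoded question.
image_cells = [[rng.random() for _ in range(IMG_DIM)] for _ in range(N_CELLS)]
question_vec = [rng.random() for _ in range(Q_DIM)]

fused = fuse(
    image_cells,
    question_vec,
    random_matrix(FUSED_DIM, IMG_DIM + Q_DIM, rng),
    [0.0] * FUSED_DIM,
)
summary = aggregate(
    fused,
    random_matrix(HID_DIM, FUSED_DIM, rng),
    random_matrix(HID_DIM, HID_DIM, rng),
)
print(len(fused), len(summary))
```

In a trained system, `summary` would presumably feed a classifier over candidate answers; here it only shows how a per-location fusion step and a sequential aggregation step fit together.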

Code

kushalkafle/PREFIL official

Tasks

Chart Question Answering

Optical Character Recognition

Optical Character Recognition (OCR)

Question Answering

Visual Question Answering

Visual Question Answering (VQA)

Datasets

Visual Question Answering

CLEVR

FigureQA

DVQA

PlotQA

Results from the Paper

Ranked #1 on Visual Question Answering (VQA) on DVQA test-familiar
