
Visual Question Answering

153 papers with code · Computer Vision

Leaderboards


Greatest papers with code

Learning to Reason: End-to-End Module Networks for Visual Question Answering

ICCV 2017 tensorflow/models

Natural language questions are inherently compositional, and many are most easily answered by reasoning about their decomposition into modular sub-problems.

VISUAL QUESTION ANSWERING
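
To make the compositional idea concrete, here is a minimal sketch of assembling per-question neural modules (for example, a find module that produces a spatial attention map feeding a describe module that answers). The module set, feature sizes, and fixed layout below are illustrative assumptions, not the paper's exact End-to-End Module Networks implementation.

# Sketch: compose small neural modules per question (illustrative, not the paper's code).
import torch
import torch.nn as nn

class Find(nn.Module):
    """Produce a spatial attention map over image features from a text argument."""
    def __init__(self, d_img=512, d_txt=512):
        super().__init__()
        self.proj = nn.Conv2d(d_img, d_txt, kernel_size=1)
    def forward(self, img_feat, txt_vec):                      # img_feat: (B, C, H, W)
        keys = self.proj(img_feat)                             # (B, d_txt, H, W)
        scores = (keys * txt_vec[:, :, None, None]).sum(1)     # (B, H, W)
        return torch.softmax(scores.flatten(1), dim=1).view_as(scores)

class Describe(nn.Module):
    """Answer from attention-weighted image features."""
    def __init__(self, d_img=512, n_answers=1000):
        super().__init__()
        self.cls = nn.Linear(d_img, n_answers)
    def forward(self, img_feat, att):                          # att: (B, H, W)
        pooled = (img_feat * att[:, None]).sum(dim=(2, 3))     # (B, C)
        return self.cls(pooled)

# One possible layout for a simple question: describe(find(img, "dog"))
find, describe = Find(), Describe()
img = torch.randn(2, 512, 14, 14)          # convolutional image features
txt = torch.randn(2, 512)                  # embedding of the module's text argument
logits = describe(img, find(img, txt))
print(logits.shape)                        # torch.Size([2, 1000])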

ParlAI: A Dialog Research Software Platform

EMNLP 2017 facebookresearch/ParlAI

We introduce ParlAI (pronounced "par-lay"), an open-source software platform for dialog research implemented in Python, available at http://parl.ai.

VISUAL QUESTION ANSWERING

Hadamard Product for Low-rank Bilinear Pooling

14 Oct 2016 facebookresearch/ParlAI

Bilinear models provide rich representations compared with linear models.

VISUAL QUESTION ANSWERING
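
The low-rank bilinear idea can be sketched briefly: project the question and image features into a shared low-rank space and combine them with a Hadamard (element-wise) product, approximating a full bilinear interaction with far fewer parameters. The dimensions and the tanh nonlinearity below are assumptions for illustration, not the paper's exact configuration.

# Minimal sketch of low-rank bilinear pooling via a Hadamard product.
import torch
import torch.nn as nn

class LowRankBilinearPooling(nn.Module):
    def __init__(self, d_q=1024, d_v=2048, d_rank=1200, d_out=1000):
        super().__init__()
        self.U = nn.Linear(d_q, d_rank)    # question projection
        self.V = nn.Linear(d_v, d_rank)    # image projection
        self.P = nn.Linear(d_rank, d_out)  # output projection
    def forward(self, q, v):
        joint = torch.tanh(self.U(q)) * torch.tanh(self.V(v))  # Hadamard product
        return self.P(joint)

pool = LowRankBilinearPooling()
q = torch.randn(8, 1024)    # question embedding
v = torch.randn(8, 2048)    # pooled image feature
print(pool(q, v).shape)     # torch.Size([8, 1000])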

Bilinear Attention Networks

NeurIPS 2018 facebookresearch/pythia

In this paper, we propose bilinear attention networks (BAN) that find bilinear attention distributions to utilize given vision-language information seamlessly.

VISUAL QUESTION ANSWERING
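
Below is a simplified sketch of a low-rank bilinear attention map between image regions and question words, in the spirit of BAN but not the exact architecture; the feature dimensions, ReLU nonlinearity, and single-glimpse setup are assumptions.

# Simplified sketch of bilinear attention between regions and words.
import torch
import torch.nn as nn

class BilinearAttention(nn.Module):
    def __init__(self, d_v=2048, d_q=768, d_rank=512):
        super().__init__()
        self.U = nn.Linear(d_v, d_rank)
        self.V = nn.Linear(d_q, d_rank)
        self.p = nn.Linear(d_rank, 1, bias=False)
    def forward(self, regions, words):
        # regions: (B, R, d_v), words: (B, T, d_q)
        x = torch.relu(self.U(regions))                               # (B, R, d_rank)
        y = torch.relu(self.V(words))                                 # (B, T, d_rank)
        # bilinear logits a[b, r, t] = p^T (x[b, r] * y[b, t])
        logits = self.p(x.unsqueeze(2) * y.unsqueeze(1)).squeeze(-1)  # (B, R, T)
        att = torch.softmax(logits.flatten(1), dim=1).view_as(logits)
        # attended joint feature: attention-weighted sum of region-word products
        joint = torch.einsum('brt,brd,btd->bd', att, x, y)            # (B, d_rank)
        return att, joint

ban = BilinearAttention()
att, joint = ban(torch.randn(4, 36, 2048), torch.randn(4, 14, 768))
print(att.shape, joint.shape)   # torch.Size([4, 36, 14]) torch.Size([4, 512])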

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering

CVPR 2018 facebookresearch/pythia

Top-down visual attention mechanisms have been used extensively in image captioning and visual question answering (VQA) to enable deeper image understanding through fine-grained analysis and even multiple steps of reasoning.

IMAGE CAPTIONING VISUAL QUESTION ANSWERING
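
A minimal sketch of top-down attention over bottom-up region features: the question vector scores each detected region, and the attention-weighted sum becomes the attended image feature. The single hidden layer, tanh nonlinearity, and feature sizes are simplifying assumptions rather than the paper's exact design.

# Sketch: question-guided (top-down) attention over detected-region (bottom-up) features.
import torch
import torch.nn as nn

class TopDownAttention(nn.Module):
    def __init__(self, d_v=2048, d_q=512, d_hidden=512):
        super().__init__()
        self.fuse = nn.Linear(d_v + d_q, d_hidden)
        self.score = nn.Linear(d_hidden, 1)
    def forward(self, regions, question):
        # regions: (B, R, d_v) bottom-up features (e.g. 36 detected objects)
        # question: (B, d_q) top-down signal
        q = question.unsqueeze(1).expand(-1, regions.size(1), -1)
        h = torch.tanh(self.fuse(torch.cat([regions, q], dim=-1)))
        att = torch.softmax(self.score(h), dim=1)        # (B, R, 1)
        return (att * regions).sum(dim=1)                # (B, d_v) attended image feature

attend = TopDownAttention()
v_hat = attend(torch.randn(2, 36, 2048), torch.randn(2, 512))
print(v_hat.shape)   # torch.Size([2, 2048])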

Towards VQA Models That Can Read

CVPR 2019 facebookresearch/mmf

We show that our proposed model, LoRRA, outperforms existing state-of-the-art VQA models on our new TextVQA dataset.

VISUAL QUESTION ANSWERING

Pythia v0.1: the Winning Entry to the VQA Challenge 2018

26 Jul 2018 facebookresearch/mmf

We demonstrate that by making subtle but important changes to the model architecture and the learning rate schedule, fine-tuning image features, and adding data augmentation, we can significantly improve the performance of the up-down model on the VQA v2.0 dataset -- from 65.67% to 70.22%.

DATA AUGMENTATION VISUAL QUESTION ANSWERING
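
The learning-rate-schedule point can be illustrated with a generic warm-up-then-step-decay scheduler; the warm-up length, decay boundaries, decay factor, and optimizer below are assumptions for illustration, not the values used by Pythia.

# Sketch: linear warm-up followed by step decay, via PyTorch's LambdaLR.
import torch

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.Adamax(model.parameters(), lr=2e-3)

def lr_lambda(step, warmup=1000, decay_steps=(14000, 19000), gamma=0.1):
    if step < warmup:
        return (step + 1) / warmup          # linear warm-up
    factor = 1.0
    for boundary in decay_steps:
        if step >= boundary:
            factor *= gamma                 # multiply by gamma after each boundary
    return factor

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
for step in range(3):                       # in practice: backward(), then step both
    optimizer.step()
    scheduler.step()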

Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge

CVPR 2018 peteanderson80/bottom-up-attention

This paper presents a state-of-the-art model for visual question answering (VQA), which won the first place in the 2017 VQA Challenge.

VISUAL QUESTION ANSWERING

A simple neural network module for relational reasoning

NeurIPS 2017 kimhc6028/relational-networks

Relational reasoning is a central component of generally intelligent behavior, but has proven difficult for neural networks to learn.

QUESTION ANSWERING RELATIONAL REASONING VISUAL QUESTION ANSWERING
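
A minimal sketch of the relational module: a shared MLP g is applied to every pair of objects (conditioned on the question embedding), the pair outputs are summed, and a second MLP f maps the sum to answer logits. Layer sizes and object counts below are illustrative assumptions.

# Sketch of a Relation Network: RN(O) = f( sum over pairs (i, j) of g(o_i, o_j, q) ).
import torch
import torch.nn as nn

class RelationNetwork(nn.Module):
    def __init__(self, d_obj=256, d_q=128, d_hidden=256, n_answers=10):
        super().__init__()
        self.g = nn.Sequential(nn.Linear(2 * d_obj + d_q, d_hidden), nn.ReLU(),
                               nn.Linear(d_hidden, d_hidden), nn.ReLU())
        self.f = nn.Sequential(nn.Linear(d_hidden, d_hidden), nn.ReLU(),
                               nn.Linear(d_hidden, n_answers))
    def forward(self, objects, question):
        # objects: (B, N, d_obj), question: (B, d_q)
        B, N, _ = objects.shape
        o_i = objects.unsqueeze(2).expand(-1, N, N, -1)     # (B, N, N, d_obj)
        o_j = objects.unsqueeze(1).expand(-1, N, N, -1)     # (B, N, N, d_obj)
        q = question[:, None, None, :].expand(-1, N, N, -1)
        pairs = torch.cat([o_i, o_j, q], dim=-1)            # all object pairs + question
        relations = self.g(pairs).sum(dim=(1, 2))           # sum over pairs
        return self.f(relations)

rn = RelationNetwork()
logits = rn(torch.randn(4, 8, 256), torch.randn(4, 128))
print(logits.shape)   # torch.Size([4, 10])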