Visual Commonsense Reasoning

11 papers with code · Reasoning
Subtask of Visual Reasoning

Benchmarks

No evaluation results yet. Help compare methods by submitting evaluation metrics.

Greatest papers with code

From Recognition to Cognition: Visual Commonsense Reasoning

CVPR 2019 rowanz/r2c

While this task is easy for humans, it is tremendously difficult for today's vision systems, requiring higher-order cognition and commonsense reasoning about the world.

VISUAL COMMONSENSE REASONING

VL-BERT: Pre-training of Generic Visual-Linguistic Representations

ICLR 2020 jackroos/VL-BERT

We introduce a new pre-trainable generic representation for visual-linguistic tasks, called Visual-Linguistic BERT (VL-BERT for short).

LANGUAGE MODELLING QUESTION ANSWERING VISUAL COMMONSENSE REASONING VISUAL QUESTION ANSWERING

ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks

NeurIPS 2019 jiasenlu/vilbert_beta

We present ViLBERT (short for Vision-and-Language BERT), a model for learning task-agnostic joint representations of image content and natural language.

IMAGE RETRIEVAL QUESTION ANSWERING VISUAL COMMONSENSE REASONING VISUAL QUESTION ANSWERING
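ViLBERT's central mechanism is a co-attentional transformer layer in which the text stream attends over the image stream's keys and values and vice versa. A rough illustrative sketch of that key/value exchange (function names and shapes here are assumptions for illustration, not the paper's code):

```python
import numpy as np

def attention(q, k, v):
    # Scaled dot-product attention with a numerically stable softmax.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def co_attention(text, image):
    """One co-attentional step: each modality uses its own queries but
    the *other* modality's keys and values, so information flows across
    the two streams while their sequence lengths stay separate."""
    text_out = attention(text, image, image)    # text attends to image
    image_out = attention(image, text, text)    # image attends to text
    return text_out, image_out
```

In the full model each stream also keeps standard self-attention and feed-forward sublayers; only the cross-stream key/value swap above distinguishes a co-attentional layer from a plain transformer layer.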

UNITER: UNiversal Image-TExt Representation Learning

25 Sep 2019 ChenRocks/UNITER

Different from previous work that applies joint random masking to both modalities, we use conditional masking on pre-training tasks (i.e., masked language/region modeling is conditioned on full observation of image/text).

LANGUAGE MODELLING QUESTION ANSWERING REPRESENTATION LEARNING TEXT MATCHING VISUAL COMMONSENSE REASONING VISUAL ENTAILMENT VISUAL QUESTION ANSWERING
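The conditional masking described above can be illustrated with a minimal sketch: in each pre-training step only one modality is corrupted while the other stays fully observed, rather than masking both at once. All names, the mask-token id, and the feature layout below are illustrative assumptions, not UNITER's actual implementation:

```python
import random

MASK_ID = 103  # hypothetical [MASK] token id (BERT-style vocabularies use 103)

def conditional_mask(text_ids, region_feats, p=0.15, mask_text=True):
    """Conditional masking: corrupt either the text or the image regions,
    never both, so the masked modality is reconstructed from a fully
    observed copy of the other modality."""
    text_out = list(text_ids)
    regions_out = [list(r) for r in region_feats]
    if mask_text:
        # Masked language modeling: mask some tokens, keep regions intact.
        for i in range(len(text_out)):
            if random.random() < p:
                text_out[i] = MASK_ID
    else:
        # Masked region modeling: zero some regions, keep text intact.
        for i in range(len(regions_out)):
            if random.random() < p:
                regions_out[i] = [0.0] * len(regions_out[i])
    return text_out, regions_out
```

Under joint random masking, by contrast, both branches would run in the same step, so a masked token might have to be recovered from a region that is itself masked.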

Heterogeneous Graph Learning for Visual Commonsense Reasoning

NeurIPS 2019 yuweijiang/HGL-pytorch

Our HGL consists of a primal vision-to-answer heterogeneous graph (VAHG) module and a dual question-to-answer heterogeneous graph (QAHG) module to interactively refine reasoning paths for semantic agreement.

VISUAL COMMONSENSE REASONING

TAB-VCR: Tags and Attributes based VCR Baselines

NeurIPS 2019 Deanplayerljx/tab-vcr

Despite impressive recent progress that has been reported on tasks that necessitate reasoning, such as visual question answering and visual dialog, models often exploit biases in datasets.

QUESTION ANSWERING VISUAL COMMONSENSE REASONING VISUAL DIALOG VISUAL QUESTION ANSWERING

Connective Cognition Network for Directional Visual Commonsense Reasoning

NeurIPS 2019 AmingWu/CCN

Inspired by this idea, towards VCR, we propose a connective cognition network (CCN) to dynamically reorganize the visual neuron connectivity that is contextualized by the meaning of questions and answers.

VISUAL COMMONSENSE REASONING