Visual Dialog

31 papers with code · Natural Language Processing
Subtask of Dialogue

Visual Dialog requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content. Specifically, given an image, a dialog history, and a follow-up question about the image, the task is to answer the question.
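
The task inputs described above can be sketched as a small data structure. This is only an illustration: the names `VisualDialogExample` and `build_text_context`, and the "Q: ... A: ..." text layout, are assumptions made for this sketch and are not taken from any of the listed papers or repositories.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class VisualDialogExample:
    """One Visual Dialog query: an image, the dialog so far, and a new question.

    All field names are hypothetical and chosen for illustration.
    """
    image_path: str                                   # location of the image being discussed
    history: List[Tuple[str, str]] = field(default_factory=list)  # earlier (question, answer) rounds
    question: str = ""                                # follow-up question to be answered


def build_text_context(example: VisualDialogExample, sep: str = " ") -> str:
    """Flatten the dialog history and the follow-up question into one string.

    A model would consume this text together with features of the image.
    """
    turns = [f"Q: {q} A: {a}" for q, a in example.history]
    turns.append(f"Q: {example.question} A:")
    return sep.join(turns)


example = VisualDialogExample(
    image_path="cat.jpg",
    history=[("is the cat asleep?", "no")],
    question="what color is it?",
)
context = build_text_context(example)
```

The answer to the follow-up question can depend on earlier rounds (here, "it" refers to the cat), which is why the history is kept alongside the image and the question.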

Greatest papers with code

Visual Dialog

CVPR 2017 facebookresearch/ParlAI

We introduce the task of Visual Dialog, which requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content.

CHATBOT VISUAL DIALOG

Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning

ICCV 2017 batra-mlp-lab/visdial-rl

Specifically, we pose a cooperative 'image guessing' game between two agents -- Qbot and Abot -- who communicate in natural language dialog so that Qbot can select an unseen image from a lineup of images.

VISUAL DIALOG VISUAL QUESTION ANSWERING

Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model

NeurIPS 2017 jiasenlu/visDial.pytorch

In contrast, discriminative dialog models (D) that are trained to rank a list of candidate human responses outperform their generative counterparts in terms of automatic metrics, diversity, and informativeness of the responses.

METRIC LEARNING TRANSFER LEARNING VISUAL DIALOG

Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline

ECCV 2020 vmurahari3/visdial-bert

Next, we find that additional finetuning using "dense" annotations in VisDial leads to even higher NDCG -- more than 10% over our base model -- but hurts MRR -- more than 17% below our base model!

LANGUAGE MODELLING REPRESENTATION LEARNING VISUAL DIALOG
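
The entry above reports NDCG and MRR moving in opposite directions. The sketch below shows how the two ranking metrics are commonly computed, which may help explain why that can happen. It uses the standard textbook formulas; the function names, the choice of cutoff `k` (the number of candidates with non-zero relevance), and the sample numbers are assumptions for illustration and are not taken from the vmurahari3/visdial-bert repository.

```python
import math
from typing import Sequence


def mean_reciprocal_rank(gt_ranks: Sequence[int]) -> float:
    """Average of 1/rank, where each rank is the 1-based position the model
    assigned to the single ground-truth answer in one dialog round."""
    return sum(1.0 / r for r in gt_ranks) / len(gt_ranks)


def ndcg(ranked_relevance: Sequence[float]) -> float:
    """Normalized discounted cumulative gain for one round.

    `ranked_relevance` lists the relevance score of every candidate answer
    in the order the model ranked them (best first). The cutoff k is assumed
    here to be the number of candidates with non-zero relevance.
    """
    k = sum(1 for rel in ranked_relevance if rel > 0)
    if k == 0:
        return 0.0

    def dcg(rels: Sequence[float]) -> float:
        # Position i (0-based) is discounted by log2(i + 2).
        return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels[:k]))

    ideal = sorted(ranked_relevance, reverse=True)
    return dcg(ranked_relevance) / dcg(ideal)


# Hypothetical numbers, for illustration only.
mrr = mean_reciprocal_rank([1, 2, 4])
score = ndcg([1.0, 0.0, 0.5, 0.0])
```

MRR credits only the position of one reference answer, while NDCG credits every candidate judged relevant. A model that ranks several acceptable answers highly but places the reference answer slightly lower can therefore gain NDCG while losing MRR.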

DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue

17 Nov 2019 JXZe/DualVD

More importantly, we can tell which modality (visual or semantic) contributes more to answering the current question by visualizing the gate values.

FEATURE SELECTION QUESTION ANSWERING VISUAL DIALOG VISUAL QUESTION ANSWERING

Recursive Visual Attention in Visual Dialog

CVPR 2019 yuleiniu/rva

Visual dialog is a challenging vision-language task, which requires the agent to answer multi-round questions about an image.

QUESTION ANSWERING VISUAL DIALOG VISUAL QUESTION ANSWERING

Large-Scale Answerer in Questioner's Mind for Visual Dialog Question Generation

ICLR 2019 naver/aqm-plus

Answerer in Questioner's Mind (AQM) is an information-theoretic framework that has been recently proposed for task-oriented dialog systems.

QUESTION GENERATION VISUAL DIALOG

Answerer in Questioner's Mind: Information Theoretic Approach to Goal-Oriented Visual Dialog

NeurIPS 2018 naver/aqm-plus

Goal-oriented dialogue tasks occur when a questioner asks an action-oriented question and an answerer responds with the intent of letting the questioner know a correct action to take.

GOAL-ORIENTED DIALOG VISUAL DIALOG

Dialog-based Interactive Image Retrieval

NeurIPS 2018 XiaoxiaoGuo/fashion-retrieval

Experiments on both simulated and real-world data show that 1) our proposed learning framework achieves better accuracy than other supervised and reinforcement learning baselines and 2) user feedback based on natural language rather than pre-specified attributes leads to more effective retrieval results, and a more natural and expressive communication interface.

IMAGE RETRIEVAL VISUAL DIALOG