Image Captioning

166 papers with code · Computer Vision

Greatest papers with code

Can Active Memory Replace Attention?

NeurIPS 2016 tensorflow/models

Several mechanisms to focus attention of a neural network on selected parts of its input or memory have been used successfully in deep learning models in recent years.

IMAGE CAPTIONING · MACHINE TRANSLATION

Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge

21 Sep 2016 tensorflow/models

Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing.

IMAGE CAPTIONING

One Model To Learn Them All

16 Jun 2017 tensorflow/tensor2tensor

We present a single model that yields good results on a number of problems spanning multiple domains.

IMAGE CAPTIONING · IMAGE CLASSIFICATION · MULTI-TASK LEARNING

Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models

7 Oct 2016 facebookresearch/fairseq-py

We observe that our method consistently outperforms beam search (BS) and previously proposed techniques for diverse decoding from neural sequence models.

IMAGE CAPTIONING · MACHINE TRANSLATION · QUESTION GENERATION · TEXT GENERATION · TIME SERIES

MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition

27 Jul 2016 deepinsight/insightface

In this paper, we design a benchmark task and provide the associated datasets for recognizing face images and linking them to corresponding entity keys in a knowledge base.

FACE RECOGNITION · IMAGE CAPTIONING

Show and Tell: A Neural Image Caption Generator

CVPR 2015 karpathy/neuraltalk

Experiments on several datasets show the accuracy of the model and the fluency of the language it learns solely from image descriptions.

IMAGE CAPTIONING · TEXT GENERATION

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering

CVPR 2018 facebookresearch/pythia

Top-down visual attention mechanisms have been used extensively in image captioning and visual question answering (VQA) to enable deeper image understanding through fine-grained analysis and even multiple steps of reasoning.

IMAGE CAPTIONING · VISUAL QUESTION ANSWERING

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 Feb 2015 sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning

Inspired by recent work in machine translation and object detection, we introduce an attention based model that automatically learns to describe the content of images.

IMAGE CAPTIONING

Self-critical Sequence Training for Image Captioning

CVPR 2017 ruotianluo/neuraltalk2.pytorch

In this paper we consider the problem of optimizing image captioning systems using reinforcement learning, and show that by carefully optimizing our systems using the test metrics of the MSCOCO task, significant gains in performance can be realized.

IMAGE CAPTIONING · POLICY GRADIENT METHODS

Grad-CAM: Why did you say that?

22 Nov 2016 ramprs/grad-cam

We propose a technique for making Convolutional Neural Network (CNN)-based models more transparent by visualizing input regions that are 'important' for predictions -- or visual explanations.

IMAGE CAPTIONING · VISUAL QUESTION ANSWERING