Caption Generation

89 papers with code • 1 benchmark • 1 dataset

Most implemented papers

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning 10 Feb 2015

Inspired by recent work in machine translation and object detection, we introduce an attention based model that automatically learns to describe the content of images.

Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks

adityac94/Grad_CAM_plus_plus 30 Oct 2017

Over the last decade, Convolutional Neural Network (CNN) models have been highly successful in solving complex vision problems.

Recurrent Neural Network Regularization

wojzaremba/lstm 8 Sep 2014

We present a simple regularization technique for Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units.

Microsoft COCO Captions: Data Collection and Evaluation Server

tylin/coco-caption 1 Apr 2015

In this paper we describe the Microsoft COCO Caption dataset and evaluation server.

Where to put the Image in an Image Caption Generator

mtanti/where-image2 27 Mar 2017

When a recurrent neural network language model is used for caption generation, the image information can be fed to the neural network either by directly incorporating it in the RNN (conditioning the language model by 'injecting' image features) or in a layer following the RNN (conditioning the language model by 'merging' image features).
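
The difference between the two conditioning styles can be illustrated with a minimal sketch. It is not the paper's implementation: the plain tanh RNN cell, the layer sizes, the random weights, and all function names (`inject_step`, `merge_step`, `init_params`) are assumptions made for illustration. The inject variant shown here feeds the image at every time step, which is only one possible way of injecting.

```python
import math
import random

random.seed(0)

# Illustrative sizes only (assumed, not taken from the paper).
VOCAB, EMBED, HIDDEN, IMG = 20, 4, 6, 5


def rand_matrix(rows, cols):
    """Small random weight matrix stored as a list of rows."""
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def rnn_step(x, h, w_x, w_h):
    """One step of a plain tanh RNN, standing in for any recurrent cell."""
    return [math.tanh(a + b) for a, b in zip(matvec(w_x, x), matvec(w_h, h))]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def init_params(rnn_in, out_in):
    return {
        "w_x": rand_matrix(HIDDEN, rnn_in),
        "w_h": rand_matrix(HIDDEN, HIDDEN),
        "w_out": rand_matrix(VOCAB, out_in),
    }


def inject_step(word_vec, img_vec, h, params):
    """Inject: image features enter the RNN together with the word."""
    h = rnn_step(word_vec + img_vec, h, params["w_x"], params["w_h"])
    return softmax(matvec(params["w_out"], h)), h


def merge_step(word_vec, img_vec, h, params):
    """Merge: the RNN sees only the word; the image joins in a later layer."""
    h = rnn_step(word_vec, h, params["w_x"], params["w_h"])
    return softmax(matvec(params["w_out"], h + img_vec)), h


# The two variants need differently shaped weights.
inject_params = init_params(rnn_in=EMBED + IMG, out_in=HIDDEN)
merge_params = init_params(rnn_in=EMBED, out_in=HIDDEN + IMG)

word_vec = [random.gauss(0.0, 1.0) for _ in range(EMBED)]
img_vec = [random.gauss(0.0, 1.0) for _ in range(IMG)]
h0 = [0.0] * HIDDEN

p_inject, h_inject = inject_step(word_vec, img_vec, h0, inject_params)
p_merge, h_merge = merge_step(word_vec, img_vec, h0, merge_params)
```

In this sketch the merge variant's hidden state depends only on the words seen so far, so the RNN acts purely as a text encoder, whereas in the inject variant the hidden state mixes linguistic and visual information.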

Scalable Bayesian Optimization Using Deep Neural Networks

automl/pybnn 19 Feb 2015

Bayesian optimization is an effective methodology for the global optimization of functions with expensive evaluations.

Sequence to Sequence -- Video to Text

nasib-ullah/video-captioning-models-in-Pytorch 3 May 2015

Our LSTM model is trained on video-sentence pairs and learns to associate a sequence of video frames to a sequence of words in order to generate a description of the event in the video clip.

An Actor-Critic Algorithm for Sequence Prediction

rizar/actor-critic-public 24 Jul 2016

We present an approach to training neural networks to generate sequences using actor-critic methods from reinforcement learning (RL).

Deep Reinforcement Learning For Sequence to Sequence Models

yaserkl/RLSeq2Seq 24 May 2018

In this survey, we consider seq2seq problems from the RL point of view and provide a formulation combining the power of RL methods in decision-making with sequence-to-sequence models that enable remembering long-term memories.

Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts

google-research-datasets/conceptual-12m CVPR 2021

The availability of large-scale image captioning and visual question answering datasets has contributed significantly to recent successes in vision-and-language pre-training.