Image Paragraph Captioning

5 papers with code • 1 benchmark • 1 dataset

Image paragraph captioning involves generating a detailed, multi-sentence description of the content of an image.

Most implemented papers

A Hierarchical Approach for Generating Descriptive Image Paragraphs

chenxinpeng/im2p CVPR 2017

Recent progress on image captioning has made it possible to generate novel sentences describing images in natural language, but compressing an image into a single sentence can describe visual content in only coarse detail.

Training for Diversity in Image Paragraph Captioning

lukemelas/image-paragraph-captioning EMNLP 2018

Image paragraph captioning models aim to produce detailed descriptions of a source image.

Context-Aware Visual Policy Network for Fine-Grained Image Captioning

daqingliu/CAVP 6 Jun 2019

With the maturity of visual detection techniques, we are more ambitious in describing visual content with open-vocabulary, fine-grained and free-form language, i.e., the task of image captioning.

Matching Visual Features to Hierarchical Semantic Topics for Image Paragraph Captioning

dandanguo1993/vtcm-based-image-paragraph-caption 10 May 2021

Inspired by recent successes in integrating semantic topics into this task, this paper develops a plug-and-play hierarchical-topic-guided image paragraph generation framework, which couples a visual extractor with a deep topic model to guide the learning of a language model.

VLIS: Unimodal Language Models Guide Multimodal Language Generation

jiwanchung/vlis 15 Oct 2023

Multimodal language generation, which leverages the synergy of language and vision, is a rapidly expanding field.