Story Visualization

20 papers with code • 3 benchmarks • 1 dataset

Story Visualization is the task of generating a coherent sequence of images that is aligned with a sequence of textual captions describing a story. It mainly consists of two tasks: story generation and story continuation. Story continuation additionally uses ground-truth information in the form of the first frame.
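The difference between the two settings can be sketched as a small input structure. This is only an illustrative sketch, not taken from any of the listed papers: the class name `StoryVisualizationInput`, its fields, and the use of a file path to stand in for the first frame are all hypothetical, and it assumes one caption per frame to be produced.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoryVisualizationInput:
    """Hypothetical container for the inputs of a story visualization model."""

    # Assumption: one caption per frame the model is asked to produce.
    captions: List[str]
    # Assumption: the ground-truth first frame is referenced by a file path.
    # It is only present in the story continuation setting.
    first_frame: Optional[str] = None

    @property
    def setting(self) -> str:
        # Story continuation is distinguished solely by the extra first frame.
        if self.first_frame is not None:
            return "story continuation"
        return "story generation"


if __name__ == "__main__":
    captions = ["Pororo opens the door.", "Pororo waves at Crong."]
    print(StoryVisualizationInput(captions).setting)
    print(StoryVisualizationInput(captions, first_frame="frame_0.png").setting)
```

In this sketch the only thing separating the two settings is whether a first frame is supplied; everything else about the model and its outputs is left open.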

Benchmarks

These leaderboards are used to track progress in Story Visualization

Dataset	Best Model
Pororo	AR-LDM
CLEVR-SV	Impartial Transformer
Zero-Shot Action Execution DiDeMO-CSV	Phenaki-Gen

Datasets

StoryBench

Most implemented papers

Make-A-Story: Visual Memory Conditioned Consistent Story Generation

ubc-vision/make-a-story • CVPR 2023

Our experiments for story generation on the MUGEN, the PororoSV and the FlintstonesSV dataset show that our method not only outperforms prior state-of-the-art in generating frames with high visual quality, which are consistent with the story, but also models appropriate correspondences between the characters and the background.

TaleCrafter: Interactive Story Visualization with Multiple Characters

videocrafter/talecrafter • 29 May 2023

Accurate Story visualization requires several necessary elements, such as identity consistency across frames, the alignment between plain text and visual content, and a reasonable layout of objects in images.

Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models

haoningwu3639/StoryGen • 1 Jun 2023

Generative models have recently exhibited exceptional capabilities in text-to-image generation, but still struggle to generate image sequences coherently.

Story Visualization by Online Text Augmentation with Context Memory

yonseivnl/cmota • ICCV 2023

Story visualization (SV) is a challenging text-to-image generation task for the difficulty of not only rendering visual details from the text descriptions but also encoding a long-term context across multiple sentences.

The Chosen One: Consistent Characters in Text-to-Image Diffusion Models

ZichengDuan/TheChosenOne • 16 Nov 2023

Our quantitative analysis demonstrates that our method strikes a better balance between prompt alignment and identity consistency compared to the baseline methods, and these findings are reinforced by a user study.

StoryGPT-V: Large Language Models as Consistent Story Visualizers

xiaoqian-shen/StoryGPT-V • 4 Dec 2023

Therefore, we introduce StoryGPT-V, which leverages the merits of the latent diffusion (LDM) and LLM to produce images with consistent and high-quality characters grounded on given story descriptions.

Training-Free Consistent Text-to-Image Generation

kousw/experimental-consistory • 5 Feb 2024

Text-to-image models offer a new level of creative flexibility by allowing users to guide the image generation process through natural language.

Masked Generative Story Transformer with Character Guidance and Caption Augmentation

chrispapa2000/maskgst • 13 Mar 2024

Story Visualization (SV) is a challenging generative vision task that requires both visual quality and consistency between different frames in generated image sequences.

