Story Visualization

20 papers with code • 3 benchmarks • 1 dataset

Story Visualization is the task of generating a coherent sequence of images aligned with a sequence of textual captions that together describe a story. It mainly consists of two tasks: story generation and story continuation, where story continuation additionally receives ground-truth information in the form of the first frame.
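The difference between the two tasks can be illustrated with a minimal sketch. It assumes an autoregressive generator that produces one frame per caption while conditioning on earlier frames, which is only one of several possible designs. The names `Frame`, `visualize_story`, and `toy_generator` are hypothetical and do not come from any of the papers listed here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class Frame:
    """Hypothetical stand-in for an image; a real system would hold pixels or latents."""
    description: str


# Assumed generator signature: (caption, frames produced so far) -> next frame.
FrameGenerator = Callable[[str, Sequence[Frame]], Frame]


def visualize_story(
    captions: Sequence[str],
    generate_frame: FrameGenerator,
    first_frame: Optional[Frame] = None,
) -> List[Frame]:
    """Return one frame per caption.

    Story generation: first_frame is None, so every frame is generated.
    Story continuation: first_frame is the ground-truth image for captions[0],
    so only the remaining frames are generated.
    """
    if not captions:
        raise ValueError("at least one caption is required")

    frames: List[Frame] = []
    remaining = captions
    if first_frame is not None:
        frames.append(first_frame)
        remaining = captions[1:]

    for caption in remaining:
        # Each new frame can see all earlier frames, which is how
        # cross-frame consistency would be encouraged in this sketch.
        frames.append(generate_frame(caption, tuple(frames)))
    return frames


def toy_generator(caption: str, history: Sequence[Frame]) -> Frame:
    """Placeholder generator that only records what it was conditioned on."""
    return Frame(f"{caption} [conditioned on {len(history)} earlier frame(s)]")


if __name__ == "__main__":
    story = ["A cat wakes up.", "The cat eats.", "The cat naps."]
    print(visualize_story(story, toy_generator))
    print(visualize_story(story, toy_generator, first_frame=Frame("ground-truth frame")))
```

In both modes the output has one frame per caption; the only difference in this sketch is whether the first frame is supplied or generated.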

StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion

faceonlive/ai-research 9 Apr 2024

The story visualization and continuation models are trained and run for inference independently, which is not user-friendly.

198

Masked Generative Story Transformer with Character Guidance and Caption Augmentation

chrispapa2000/maskgst 13 Mar 2024

Story Visualization (SV) is a challenging generative vision task that requires both visual quality and consistency between different frames in generated image sequences.

1

Training-Free Consistent Text-to-Image Generation

kousw/experimental-consistory 5 Feb 2024

Text-to-image models offer a new level of creative flexibility by allowing users to guide the image generation process through natural language.

47

StoryGPT-V: Large Language Models as Consistent Story Visualizers

xiaoqian-shen/StoryGPT-V 4 Dec 2023

Therefore, we introduce StoryGPT-V, which leverages the merits of latent diffusion models (LDM) and LLMs to produce images with consistent and high-quality characters grounded on given story descriptions.

32

The Chosen One: Consistent Characters in Text-to-Image Diffusion Models

ZichengDuan/TheChosenOne 16 Nov 2023

Our quantitative analysis demonstrates that our method strikes a better balance between prompt alignment and identity consistency compared to the baseline methods, and these findings are reinforced by a user study.

200

Story Visualization by Online Text Augmentation with Context Memory

yonseivnl/cmota ICCV 2023

Story visualization (SV) is a challenging text-to-image generation task because it requires not only rendering visual details from the text descriptions but also encoding long-term context across multiple sentences.

7
15 Aug 2023

Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models

haoningwu3639/StoryGen 1 Jun 2023

Generative models have recently exhibited exceptional capabilities in text-to-image generation, but still struggle to generate image sequences coherently.

153

TaleCrafter: Interactive Story Visualization with Multiple Characters

videocrafter/talecrafter 29 May 2023

Accurate story visualization requires several elements, such as identity consistency across frames, alignment between plain text and visual content, and a reasonable layout of objects in images.

239

Make-A-Story: Visual Memory Conditioned Consistent Story Generation

ubc-vision/make-a-story CVPR 2023

Our experiments for story generation on the MUGEN, PororoSV, and FlintstonesSV datasets show that our method not only outperforms the prior state of the art in generating frames with high visual quality that are consistent with the story, but also models appropriate correspondences between the characters and the background.

34
23 Nov 2022

Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models

xichenpan/ARLDM 20 Nov 2022

Conditioned diffusion models have demonstrated state-of-the-art text-to-image synthesis capacity.

177