4 dataset results for Visual Storytelling AND Images

VIST

The Visual Storytelling Dataset (VIST) consists of 210,819 unique photos and 50,000 stories. The images were collected from Flickr albums, each containing 10 to 50 images taken within a 48-hour span. The stories were written by workers on Amazon Mechanical Turk, who were instructed to choose five images from an album and write a story about them. Every story has five sentences, and each sentence is paired with its corresponding image. The dataset is split into three subsets: a training set (80%), a validation set (10%), and a test set (10%). All words and punctuation marks in the stories are separated by a space character. All location names are replaced with the word "location", and all names of people are replaced with the word "male" or "female", depending on the gender of the person.
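
The text normalization described above can be illustrated with a minimal Python sketch. This is not the dataset authors' actual preprocessing code: the function names `space_separate` and `anonymize`, and the lookup tables passed to `anonymize`, are hypothetical and exist only to show what space-separated tokens and placeholder words might look like.

```python
import re


def space_separate(text: str) -> str:
    """Separate words and punctuation marks with single spaces."""
    # Surround every character that is neither a word character nor
    # whitespace with spaces (note: this also splits apostrophes).
    spaced = re.sub(r"([^\w\s])", r" \1 ", text)
    # Collapse runs of whitespace and trim both ends.
    return " ".join(spaced.split())


def anonymize(tokens, locations, genders):
    """Replace known single-token names with placeholder words.

    `locations` (a set) and `genders` (a dict mapping a name to
    "male" or "female") are hypothetical lookup tables; the source
    does not say how names were actually detected.
    """
    out = []
    for tok in tokens:
        key = tok.lower()
        if key in locations:
            out.append("location")
        elif key in genders:
            out.append(genders[key])
        else:
            out.append(tok)
    return out


sentence = space_separate("We went to Paris, and Anna loved it!")
tokens = anonymize(sentence.split(), {"paris"}, {"anna": "female"})
print(" ".join(tokens))
```

The sketch handles only single-token names; multi-word location names would need a span-level matcher, which is left out here.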

99 PAPERS • 2 BENCHMARKS

VIST-Edit

VIST-Edit includes 14,905 human-edited versions of 2,981 machine-generated visual stories. The stories were generated by two state-of-the-art visual storytelling models, and each machine-generated story is aligned to 5 human-edited versions.

2 PAPERS • NO BENCHMARKS YET

Creative Visual Storytelling Anthology (ARL Creative Visual Storytelling Anthology)

The Creative Visual Storytelling Anthology is a collection of 100 author responses to an improved creative visual storytelling exercise over a sequence of three images. Each item contains four facet entries, corresponding to Entity, Scene, Narrative, and Title.

1 PAPER • NO BENCHMARKS YET

Visual Writing Prompts

1 PAPER • NO BENCHMARKS YET