ESP (Evaluation for Styled Prompt)

Introduced by Yu et al. in Multimodal Knowledge Alignment with Reinforcement Learning

ESP dataset (Evaluation for Styled Prompt dataset) is a benchmark for zero-shot domain-conditional caption generation. ESP is a new dataset focusing on providing multiple styled text targets for the same image. It comprises 4.8k captions from 1k images in the COCO Captions test set. We collect five text domains with everyday usage: blog, social media, instruction, story, and news.

Homepage

Benchmarks

Add a new result Link an existing benchmark

No benchmarks yet. Start a new benchmark or link an existing one.

Papers

Paper	Code	Results	Date	Stars

Dataset Loaders

Add Remove

No data loaders found. You can submit your data loader here.

Tasks

Image Captioning

Usage

License

Unknown

Modalities

Images
Texts

Languages

English

ESP (Evaluation for Styled Prompt)

Benchmarks Edit Add a new result Link an existing benchmark

Papers

Dataset Loaders Edit Add Remove

Tasks Edit