4 dataset results for Image Generation AND Texts AND English

Multi-Modal-CelebA-HQ is a large-scale face image dataset of 30,000 high-resolution face images selected from the CelebA dataset by following CelebA-HQ. Each image comes with a high-quality segmentation mask, a sketch, descriptive text, and a version of the image with a transparent background.

27 PAPERS • 3 BENCHMARKS

MatSynth

MatSynth is a Physically Based Rendering (PBR) materials dataset designed for modern AI applications. The dataset consists of over 4,000 ultra-high-resolution materials, offering unparalleled scale, diversity, and detail.

2 PAPERS • NO BENCHMARKS YET

WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images

WHOOPS! is a dataset and benchmark for visual commonsense. The dataset consists of purposefully commonsense-defying images created by designers using publicly available image generation tools such as Midjourney. The images defy commonsense for a wide range of reasons, including deviations from expected social norms and everyday knowledge.

2 PAPERS • 4 BENCHMARKS

ENTIGEN (Ethical NaTural Language Interventions in Text-to-Image GENeration)

ENTIGEN is a benchmark dataset to evaluate the change in image generations conditional on ethical interventions across three social axes -- gender, skin color, and culture. It contains 246 prompts based on an attribute set containing diverse professions, objects, and cultural scenarios.

1 PAPER • NO BENCHMARKS YET
