Text-to-Image Generation
275 papers with code • 11 benchmarks • 18 datasets
Text-to-Image Generation is a task in computer vision and natural language processing where the goal is to generate an image that corresponds to a given textual description. This involves converting the text input into a meaningful representation, such as a feature vector, and then using this representation to generate an image that matches the description.
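In practice, open-source libraries expose this task as a single pipeline call. Below is a minimal sketch using the Hugging Face diffusers library; the checkpoint ID, prompt, and output filename are illustrative placeholders, and exact arguments may vary between library versions.

```python
# Minimal text-to-image sketch using Hugging Face diffusers.
# The checkpoint and prompt are illustrative; any compatible
# Stable Diffusion checkpoint can be substituted.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # example checkpoint
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # assumes a CUDA-capable GPU

# The prompt is encoded into a feature representation by the
# pipeline's text encoder, which then conditions the iterative
# denoising process that produces the image.
image = pipe("a watercolor painting of a fox in a forest").images[0]
image.save("fox.png")
```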
Latest papers with no code
MaxFusion: Plug&Play Multi-Modal Generation in Text-to-Image Diffusion Models
Large diffusion-based Text-to-Image (T2I) models have shown impressive generative capabilities, both for plain text-conditioned synthesis and for spatially conditioned image generation.
DreamScape: 3D Scene Creation via Gaussian Splatting joint Correlation Modeling
Recent progress in text-to-3D creation has been propelled by integrating the potent prior of Diffusion Models from text-to-image generation into the 3D domain.
Mask-ControlNet: Higher-Quality Image Generation with An Additional Mask Prompt
The object images are then used as additional prompts to help the diffusion model better understand the relationship between foreground and background regions during image generation.
Concept Weaver: Enabling Multi-Concept Fusion in Text-to-Image Models
While there has been significant progress in customizing text-to-image generation models, generating images that combine multiple personalized concepts remains challenging.
Diverse and Tailored Image Generation for Zero-shot Multi-label Classification
Our approach introduces a novel image generation framework that produces multi-label synthetic images of unseen classes for classifier training.
On the Scalability of Diffusion-based Text-to-Image Generation
On the data scaling side, we show that the quality and diversity of the training set matter more than raw dataset size.
MatAtlas: Text-driven Consistent Geometry Texturing and Material Assignment
We present MatAtlas, a method for consistent text-guided 3D model texturing.
MULAN: A Multi Layer Annotated Dataset for Controllable Text-to-Image Generation
To build MuLAn, we developed a training-free pipeline that decomposes a monocular RGB image into a stack of RGBA layers comprising a background and isolated instances.
Survey of Bias In Text-to-Image Generation: Definition, Evaluation, and Mitigation
In this survey, we review prior studies on dimensions of bias: Gender, Skintone, and Geo-Culture.
Condition-Aware Neural Network for Controlled Image Generation
In parallel to prior conditional control methods, CAN controls the image generation process by dynamically manipulating the weights of the neural network.
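The weight-manipulation idea can be sketched generically: a small network maps the condition (e.g., a class or text embedding) to a modulation of a layer's weights. The PyTorch snippet below is an illustrative sketch of condition-dependent weight scaling, not the CAN paper's actual architecture; all module names and dimensions are assumptions.

```python
import torch
import torch.nn as nn

class ConditionAwareLinear(nn.Module):
    """Illustrative sketch: a linear layer whose weight rows are
    rescaled per sample by a condition embedding. Not the CAN
    paper's actual architecture."""

    def __init__(self, in_dim, out_dim, cond_dim):
        super().__init__()
        self.base_weight = nn.Parameter(torch.randn(out_dim, in_dim) * 0.02)
        # Maps the condition to a per-output-channel scale.
        self.cond_to_scale = nn.Linear(cond_dim, out_dim)

    def forward(self, x, cond):
        # Scaling the output channels is equivalent to rescaling
        # the corresponding weight rows for each sample.
        scale = 1.0 + self.cond_to_scale(cond)   # (batch, out_dim)
        out = x @ self.base_weight.t()           # shared base weights
        return out * scale                       # condition-aware output

# Usage: x is a feature batch, cond a condition embedding.
layer = ConditionAwareLinear(in_dim=64, out_dim=128, cond_dim=32)
x = torch.randn(4, 64)
cond = torch.randn(4, 32)
y = layer(x, cond)  # shape (4, 128)
```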