Text-to-Image Generation

276 papers with code • 11 benchmarks • 18 datasets

Most implemented papers

MaskGIT: Masked Generative Image Transformer

google-research/maskgit CVPR 2022

At inference time, the model begins by generating all tokens of an image simultaneously, and then refines the image iteratively, conditioned on the previous generation.
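
As a concrete illustration of this decoding scheme, the sketch below runs MaskGIT-style iterative parallel decoding with confidence-based re-masking under a cosine mask schedule. It is a simplified sketch, not the authors' implementation: `predict_logits` is a stand-in for the trained bidirectional transformer, and the sampling and scheduling details are assumptions.

```python
import math
import numpy as np

def iterative_decode(predict_logits, num_tokens=256, vocab_size=1024, steps=8, seed=0):
    """MaskGIT-style parallel decoding sketch: start with every token masked,
    keep the most confident predictions each step, and re-mask the rest."""
    rng = np.random.default_rng(seed)
    MASK = -1
    tokens = np.full(num_tokens, MASK, dtype=np.int64)    # all tokens start masked
    confidence = np.zeros(num_tokens)

    for t in range(steps):
        logits = predict_logits(tokens)                    # (num_tokens, vocab_size)
        probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
        probs /= probs.sum(axis=-1, keepdims=True)
        sampled = np.array([rng.choice(vocab_size, p=p) for p in probs])
        sampled_conf = probs[np.arange(num_tokens), sampled]

        masked = tokens == MASK                            # only masked slots are updated
        tokens = np.where(masked, sampled, tokens)
        confidence = np.where(masked, sampled_conf, np.inf)  # kept tokens are never re-masked

        # Cosine schedule: the number of re-masked tokens shrinks every step.
        num_to_mask = int(num_tokens * math.cos(math.pi / 2 * (t + 1) / steps))
        if num_to_mask == 0:
            break
        remask = np.argsort(confidence)[:num_to_mask]      # least confident positions
        tokens[remask] = MASK
    return tokens

# Dummy predictor standing in for the transformer:
dummy = lambda toks: np.random.default_rng(1).normal(size=(toks.shape[0], 1024))
print(iterative_decode(dummy)[:10])
```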

TediGAN: Text-Guided Diverse Face Image Generation and Manipulation

weihaox/TediGAN CVPR 2021

In this work, we propose TediGAN, a novel framework for multi-modal image generation and manipulation with textual descriptions.

DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-to-Image Synthesis

MinfengZhu/DM-GAN CVPR 2019

If the initial image is poorly generated, the subsequent refinement stages can hardly improve it to a satisfactory quality.
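
That observation refers to the coarse-to-fine pipeline that stacked text-to-image GANs (including DM-GAN) build on: an initial low-resolution image is generated from the text, and later stages only refine it, so a weak initial image caps the final result. The skeleton below shows just that pipeline shape; the layers are illustrative placeholders, not DM-GAN's generators or its dynamic memory module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoarseToFineT2I(nn.Module):
    """Skeleton of a stacked (coarse-to-fine) text-to-image generator:
    stage 0 produces a small image from the sentence embedding, and each
    later stage upsamples and refines it. Placeholder layers only."""
    def __init__(self, text_dim=256, z_dim=100):
        super().__init__()
        self.initial_stage = nn.Sequential(
            nn.Linear(text_dim + z_dim, 3 * 64 * 64), nn.Tanh())
        self.refine_stage = nn.Sequential(   # would be repeated per resolution
            nn.Conv2d(3, 3, kernel_size=3, padding=1), nn.Tanh())

    def forward(self, text_emb, z):
        x = self.initial_stage(torch.cat([text_emb, z], dim=1))
        img = x.view(-1, 3, 64, 64)                    # coarse 64x64 image
        img = F.interpolate(img, scale_factor=2)       # upsample to 128x128
        return self.refine_stage(img)                  # refinement can only work with what stage 0 produced

g = CoarseToFineT2I()
print(g(torch.randn(2, 256), torch.randn(2, 100)).shape)  # torch.Size([2, 3, 128, 128])
```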

CogView: Mastering Text-to-Image Generation via Transformers

THUDM/CogView NeurIPS 2021

Text-to-image generation in the general domain has long been an open problem, as it requires both a powerful generative model and cross-modal understanding.

OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

ofa-sys/ofa 7 Feb 2022

In this work, we pursue a unified paradigm for multimodal pretraining to break the scaffolds of complex task/modality-specific customization.
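
The unifying idea is that every task, text-to-image generation included, is cast as sequence-to-sequence generation over one shared vocabulary covering text tokens and discretized image codes, with the task specified by a plain-text instruction. The toy sketch below only illustrates that framing: the vocabulary sizes, the instruction string, and `seq2seq_generate` are hypothetical stand-ins, not the OFA API.

```python
# Toy illustration of the "everything is sequence-to-sequence" framing:
# the instruction and the output (here, discretized image codes) share one
# token space, so a single encoder-decoder can handle many tasks.
TEXT_VOCAB_SIZE = 50_000        # ordinary BPE text tokens (assumed size)
IMAGE_CODEBOOK_SIZE = 8192      # VQ image codes appended after the text vocabulary (assumed)
IMAGE_TOKENS_PER_SAMPLE = 1024  # e.g. a 32x32 grid of codes (assumed)

def image_code_to_token(code: int) -> int:
    """Map a VQ image code into the shared vocabulary, after the text tokens."""
    return TEXT_VOCAB_SIZE + code

def build_generation_prompt(caption: str) -> str:
    # Tasks are distinguished purely by a plain-text instruction (wording assumed here).
    return f'what is the complete image? caption: "{caption}"'

def decode_image(tokens: list[int]) -> list[int]:
    """Recover the VQ codes; a separate decoder (e.g. a VQGAN) would render pixels."""
    return [t - TEXT_VOCAB_SIZE for t in tokens if t >= TEXT_VOCAB_SIZE]

# Example flow (seq2seq_generate would be the trained encoder-decoder):
# prompt = build_generation_prompt("a blue bird on a branch")
# tokens = seq2seq_generate(prompt, max_new_tokens=IMAGE_TOKENS_PER_SAMPLE)
# codes  = decode_image(tokens)
```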

A Novel Sampling Scheme for Text- and Image-Conditional Image Synthesis in Quantized Latent Spaces

dome272/paella 14 Nov 2022

Recent advancements in the domain of text-to-image synthesis have culminated in a multitude of enhancements pertaining to quality, fidelity, and diversity.

Muse: Text-To-Image Generation via Masked Generative Transformers

lucidrains/muse-pytorch 2 Jan 2023

Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient due to the use of discrete tokens and requiring fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient due to the use of parallel decoding.
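
The efficiency argument comes down to how many forward passes of the network one image costs under each decoding regime. The comparison below uses illustrative, assumed step counts rather than figures reported in the papers.

```python
# Rough count of model forward passes per image under the three regimes above.
# The concrete numbers are illustrative assumptions, not reported results.
num_image_tokens = 1024                    # e.g. a 32x32 grid of discrete tokens
autoregressive_passes = num_image_tokens   # one pass per token (Parti-style decoding)
diffusion_passes = 250                     # a typical pixel-space sampling budget (assumed)
parallel_decode_passes = 24                # a small, fixed number of refinement passes (assumed)

for name, n in [("autoregressive", autoregressive_passes),
                ("pixel-space diffusion", diffusion_passes),
                ("parallel decoding (Muse-style)", parallel_decode_passes)]:
    print(f"{name:32s} {n:4d} forward passes")
```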

Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction

Maluuba/GeNeVA ICCV 2019

Conditional text-to-image generation is an active area of research, with many possible applications.

ManiGAN: Text-Guided Image Manipulation

mrlibw/ManiGAN 12 Dec 2019

The goal of our paper is to semantically edit parts of an image matching a given text that describes desired attributes (e.g., texture, colour, and background), while preserving other contents that are irrelevant to the text.
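
One way to read that objective is as a two-term loss: a text-image matching term that pushes the described attributes to appear, and a reconstruction term that keeps text-irrelevant content close to the input. The sketch below captures only that generic reading with stand-in tensors; it is not ManiGAN's actual architecture or losses.

```python
import torch
import torch.nn.functional as F

def text_guided_edit_loss(edited, original, text_match_score, lam_rec=1.0):
    """Generic text-guided editing objective (an illustration, not ManiGAN's losses):
    reward agreement between the edited image and the text, while an L1 term keeps
    content the text does not describe close to the original image."""
    match_term = -text_match_score.mean()        # e.g. a score from an image-text matching network
    preserve_term = F.l1_loss(edited, original)  # discourages changing unrelated regions
    return match_term + lam_rec * preserve_term

# Toy usage with random tensors standing in for real images and a matching score:
edited, original = torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64)
print(text_guided_edit_loss(edited, original, torch.rand(2)))
```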

Conditional Image Generation and Manipulation for User-Specified Content

IIGROUP/Multi-Modal-CelebA-HQ-Dataset 11 May 2020

Steering image generation towards user-specified content can be done by conditioning the model on additional information.
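
In its simplest form, conditioning means feeding the extra information (for example, a text embedding) into the generator alongside the noise vector. The minimal sketch below shows that general idea with a concatenation-based conditional generator; it is not the specific architecture used in the paper.

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Minimal illustration of conditioning: the extra information (here a text
    embedding) is concatenated to the noise vector before generation."""
    def __init__(self, z_dim=100, cond_dim=256, img_size=64):
        super().__init__()
        self.img_size = img_size
        self.net = nn.Sequential(
            nn.Linear(z_dim + cond_dim, 512), nn.ReLU(),
            nn.Linear(512, 3 * img_size * img_size), nn.Tanh())

    def forward(self, z, cond):
        x = self.net(torch.cat([z, cond], dim=1))
        return x.view(-1, 3, self.img_size, self.img_size)

g = ConditionalGenerator()
img = g(torch.randn(4, 100), torch.randn(4, 256))  # 4 images conditioned on 4 text embeddings
print(img.shape)  # torch.Size([4, 3, 64, 64])
```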