Image Generation
1918 papers with code • 85 benchmarks • 67 datasets
Image Generation (synthesis) is the task of generating new images that look as if they were drawn from the distribution of an existing dataset.
- Unconditional generation refers to sampling images from the learned data distribution alone, i.e. modeling $p(y)$, where $y$ is an image.
- Conditional image generation (a subtask) refers to sampling images conditioned on side information such as a class label $x$, i.e. modeling $p(y|x)$.
In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.
(Image credit: StyleGAN)
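As a minimal illustration of the two settings, the sketch below uses toy PyTorch modules (hypothetical, standing in for real models such as StyleGAN or a class-conditional diffusion model): an unconditional generator maps only a latent code to an image, while a conditional generator also consumes a label.

```python
import torch
import torch.nn as nn

# Toy generators (illustrative only) contrasting p(y) with p(y|x).

class UncondGenerator(nn.Module):
    """Unconditional: image y is sampled from p(y) via a latent code z."""
    def __init__(self, z_dim=128):
        super().__init__()
        self.net = nn.Linear(z_dim, 3 * 32 * 32)

    def forward(self, z):
        return self.net(z).view(-1, 3, 32, 32)

class CondGenerator(nn.Module):
    """Conditional: image y is sampled from p(y|x) given a class label x."""
    def __init__(self, z_dim=128, n_classes=10):
        super().__init__()
        self.embed = nn.Embedding(n_classes, z_dim)
        self.net = nn.Linear(2 * z_dim, 3 * 32 * 32)

    def forward(self, z, x):
        h = torch.cat([z, self.embed(x)], dim=1)
        return self.net(h).view(-1, 3, 32, 32)

z = torch.randn(16, 128)                                  # z ~ N(0, I)
y = UncondGenerator()(z)                                  # y ~ p(y)
y_cond = CondGenerator()(z, torch.randint(0, 10, (16,)))  # y ~ p(y|x)
```

Real architectures differ enormously, but they expose these same two sampling interfaces.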
Libraries
Use these libraries to find Image Generation models and implementations.
Subtasks
- Image-to-Image Translation
- Image Inpainting
- Text-to-Image Generation
- Conditional Image Generation
- Face Generation
- Image Harmonization
- Pose Transfer
- 3D-Aware Image Synthesis
- Facial Inpainting
- Layout-to-Image Generation
- ROI-Based Image Generation
- Image Generation from Scene Graphs
- Pose-Guided Image Generation
- User Constrained Thumbnail Generation
- Handwritten Word Generation
- Chinese Landscape Painting Generation
- Person Reposing
- Infinite Image Generation
- Multi-Class One-Shot Image Synthesis
- Single-Class Few-Shot Image Synthesis
Latest papers
Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance
These techniques are often not applicable in unconditional generation or in various downstream tasks such as image restoration.
SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions
Recent advancements in diffusion models have positioned them at the forefront of image generation.
Multi-Scale Texture Loss for CT denoising with GANs
To capture highly complex and non-linear textural relationships during training, this work presents a loss function that leverages the intrinsic multi-scale nature of the Gray-Level Co-occurrence Matrix (GLCM).
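For context, a GLCM counts how often pairs of gray levels co-occur at a given pixel offset, and "multi-scale" here means evaluating it at several offsets. The sketch below (an illustrative assumption using scikit-image, not the paper's differentiable GAN loss) compares two grayscale images by a GLCM statistic across scales:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def multiscale_glcm_distance(a, b, scales=(1, 2, 4), levels=16):
    # a, b: grayscale images with values in [0, 1]; quantize to `levels` bins.
    qa = (a * (levels - 1)).astype(np.uint8)
    qb = (b * (levels - 1)).astype(np.uint8)
    dist = 0.0
    for s in scales:  # offset distance in pixels = the "scale"
        ga = graycomatrix(qa, [s], [0], levels=levels, symmetric=True, normed=True)
        gb = graycomatrix(qb, [s], [0], levels=levels, symmetric=True, normed=True)
        # Compare one texture statistic (contrast) per scale.
        dist += abs(graycoprops(ga, "contrast")[0, 0]
                    - graycoprops(gb, "contrast")[0, 0])
    return dist / len(scales)

x = np.random.rand(64, 64)
print(multiscale_glcm_distance(x, np.clip(x + 0.1, 0, 1)))
```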
Long-CLIP: Unlocking the Long-Text Capability of CLIP
Contrastive Language-Image Pre-training (CLIP) has been the cornerstone for zero-shot classification, text-image retrieval, and text-image generation by aligning image and text modalities.
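The alignment mentioned here is CLIP's symmetric contrastive objective over paired image and text embeddings; a generic sketch of that objective (not Long-CLIP's specific training recipe) looks like this:

```python
import torch
import torch.nn.functional as F

def clip_loss(img_emb, txt_emb, temperature=0.07):
    # Normalize both modalities, score all pairs, and pull matching
    # image/text pairs (the diagonal) together in both directions.
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(len(img))       # matching pairs on the diagonal
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.t(), targets)) / 2

print(clip_loss(torch.randn(8, 512), torch.randn(8, 512)))
```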
Generative Active Learning for Image Synthesis Personalization
The primary challenge in conducting active learning on generative models lies in the open-ended nature of querying, which differs from the closed form of querying in discriminative models that typically target a single concept.
Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models
This approach limits the generation of segmentation masks derived from word tokens not contained in the text prompt.
Diversity-aware Channel Pruning for StyleGAN Compression
Specifically, by assessing channel importance based on their sensitivities to latent vector perturbations, our method enhances the diversity of samples in the compressed model.
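A rough sketch of that scoring idea, under the assumption that a hypothetical `layer_acts` helper exposes one generator layer's activations (the paper's actual procedure is more involved):

```python
import torch

@torch.no_grad()
def channel_sensitivity(G, layer_acts, z, eps=0.05, n_probes=8):
    # layer_acts(G, z) -> (B, C, H, W) activations of one generator layer
    # (hypothetical hook; in practice a forward hook on a StyleGAN block).
    base = layer_acts(G, z)
    score = torch.zeros(base.shape[1])
    for _ in range(n_probes):
        pert = layer_acts(G, z + eps * torch.randn_like(z))
        # Channels whose activations react strongly to latent perturbations
        # are assumed to carry sample diversity, so they should be kept.
        score += (pert - base).abs().mean(dim=(0, 2, 3))
    return score / n_probes
```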
IIDM: Image-to-Image Diffusion Model for Semantic Image Synthesis
Semantic image synthesis aims to generate high-quality images given semantic conditions, i.e., segmentation masks and style reference images.
Step-Calibrated Diffusion for Biomedical Optical Image Restoration
Here, we present Restorative Step-Calibrated Diffusion (RSCD), an unpaired image restoration method that views the image restoration problem as completing the finishing steps of a diffusion-based image generation task.
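The quoted idea resembles re-noising an input partway and then denoising from there (as in SDEdit-style editing). The sketch below is a simplified assumption using a generic DDPM reverse process, not RSCD's actual step calibration; `eps_model` is a placeholder for a trained noise predictor:

```python
import torch

@torch.no_grad()
def restore(degraded, eps_model, betas, t_start=100):
    alphas = 1.0 - betas
    abar = torch.cumprod(alphas, dim=0)
    # Re-noise the degraded image to match the diffusion marginal at t_start.
    x = abar[t_start].sqrt() * degraded \
        + (1 - abar[t_start]).sqrt() * torch.randn_like(degraded)
    # Run only the remaining reverse (denoising) steps, not the full chain.
    for t in range(t_start, -1, -1):
        eps = eps_model(x, torch.full((x.shape[0],), t))
        mean = (x - betas[t] / (1 - abar[t]).sqrt() * eps) / alphas[t].sqrt()
        x = mean + betas[t].sqrt() * torch.randn_like(x) if t > 0 else mean
    return x

# Toy usage with a dummy predictor standing in for a trained network:
betas = torch.linspace(1e-4, 0.02, 1000)
out = restore(torch.rand(1, 3, 64, 64), lambda x, t: torch.zeros_like(x), betas)
```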
Towards Learning Contrast Kinetics with Multi-Condition Latent Diffusion Models
Contrast agents in dynamic contrast-enhanced magnetic resonance imaging make it possible to localize tumors and observe their contrast kinetics, which is essential for cancer characterization and the corresponding treatment decisions.