Late-Constraint Diffusion Guidance for Controllable Image Synthesis

19 May 2023 · Chang Liu, Dong Liu

Diffusion models, with or without text conditioning, have demonstrated impressive capability in synthesizing photorealistic images given a few or even no words. These models may not fully satisfy user needs, as ordinary users and artists want to control the synthesized images with specific guidance, such as overall layout, color, structure, and object shape. To adapt diffusion models for controllable image synthesis, several methods have been proposed that incorporate the required conditions as regularization upon the intermediate features of the diffusion denoising network. These methods, which we call early-constraint methods in this paper, have difficulty handling multiple conditions with a single solution: they tend to train a separate model for each specific condition, which incurs substantial training cost and yields non-generalizable solutions. To address these difficulties, we propose a new approach, namely late-constraint: we leave the diffusion network unchanged, but constrain its output to be aligned with the required conditions. Specifically, we train a lightweight condition adapter to establish the correlation between external conditions and internal representations of diffusion models. During the iterative denoising process, the conditional guidance is fed into the corresponding condition adapter to manipulate the sampling process via the established correlation. We further equip the introduced late-constraint strategy with a timestep resampling method and an early stopping technique, which boost the quality of the synthesized images while complying with the guidance. Our method outperforms existing early-constraint methods and generalizes better to unseen conditions. Our code will be made available.
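The core idea above (a frozen denoiser steered at sampling time by a small adapter) can be sketched as classifier-guidance-style sampling. The sketch below is an illustrative assumption, not the paper's released code: `ConditionAdapter`, `guided_denoise_step`, and the loss weighting are hypothetical placeholders for the adapter and guidance described in the abstract.

```python
import torch
import torch.nn as nn

class ConditionAdapter(nn.Module):
    """Hypothetical lightweight adapter: maps the denoiser's predicted clean
    image to the condition space (e.g. a one-channel edge map) so an alignment
    loss against the user-supplied condition can steer sampling."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 16, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(16, 1, 3, padding=1),  # 1-channel condition output
        )

    def forward(self, x0_pred):
        return self.net(x0_pred)

def guided_denoise_step(x_t, t, eps_model, adapter, condition, alpha_bar, scale=1.0):
    """One denoising step with late-constraint guidance: the diffusion network
    stays frozen; the noisy sample is nudged by the gradient of an alignment
    loss between the adapter's output and the target condition."""
    x_t = x_t.detach().requires_grad_(True)
    eps = eps_model(x_t, t)  # frozen denoiser's noise prediction
    a = alpha_bar[t]
    # Predicted clean image from the standard DDPM reparameterization.
    x0_pred = (x_t - (1 - a).sqrt() * eps) / a.sqrt()
    loss = ((adapter(x0_pred) - condition) ** 2).mean()
    grad = torch.autograd.grad(loss, x_t)[0]
    # Move the sample toward satisfying the condition (guidance-style update).
    return (x_t - scale * grad).detach()
```

In a full sampler this update would be applied inside the usual reverse-diffusion loop; the timestep resampling and early stopping mentioned in the abstract would decide at which steps (and for how long) the guidance is applied.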

Results (all entries: Conditional Text-to-Image Synthesis on COCO 2017 val):

| Model                                             | FID   | Rank | CLIP Score | Rank |
|---------------------------------------------------|-------|------|------------|------|
| LCDG                                              | 20.27 | #1   |            |      |
| LCDG (Color, evaluated under image palette)       | 20.61 | #2   | 0.2580     | #4   |
| LCDG (Mask)                                       | 20.94 | #3   | 0.2617     | #2   |
| LCDG (Edge)                                       | 21.02 | #4   |            |      |
| T2I-Adapter (Sketch)                              | 21.72 | #5   | 0.2580     | #4   |
| T2I-Adapter (Color, evaluated under image palette)| 26.54 | #6   | 0.2613     | #3   |
| T2I-Adapter (Color, evaluated under color stroke) | 30.84 | #9   |            |      |
| ControlNet (HED Edge)                             | 28.09 | #8   | 0.2525     | #6   |
| SD (text)                                         | 27.99 | #7   | 0.2673     | #1   |
| SD using SDEdit (evaluated under image palette)   |       |      | 0.2138     | #8   |
| SD using SDEdit (evaluated under color stroke)    | 32.93 | #10  | 0.2257     | #7   |
| SD using SDEdit                                   | 71.16 | #11  |            |      |
