TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Semantic correspondence	PF-PASCAL	SD+DINO (Supervised)	PCK	93.6	# 4
Semantic correspondence	SPair-71k	SD+DINO (Supervised)	PCK	74.6	# 3
Semantic correspondence	SPair-71k	SD+DINO (Zero-shot)	PCK	64.0	# 6

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/a-tale-of-two-features-stable-diffusion/semantic-correspondence-on-spair-71k)](https://paperswithcode.com/sota/semantic-correspondence-on-spair-71k?p=a-tale-of-two-features-stable-diffusion)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/a-tale-of-two-features-stable-diffusion/semantic-correspondence-on-pf-pascal)](https://paperswithcode.com/sota/semantic-correspondence-on-pf-pascal?p=a-tale-of-two-features-stable-diffusion)`

A Tale of Two Features: Stable Diffusion Complements DINO for Zero-Shot Semantic Correspondence

NeurIPS 2023 · Junyi Zhang, Charles Herrmann, Junhwa Hur, Luisa Polania Cabrera, Varun Jampani, Deqing Sun, Ming-Hsuan Yang

Text-to-image diffusion models have made significant advances in generating and editing high-quality images. As a result, numerous approaches have explored the ability of diffusion model features to understand and process single images for downstream tasks, e.g., classification, semantic segmentation, and stylization. However, significantly less is known about what these features reveal across multiple, different images and objects. In this work, we exploit Stable Diffusion (SD) features for semantic and dense correspondence and discover that, with simple post-processing, SD features can perform quantitatively on par with state-of-the-art (SOTA) representations. Interestingly, the qualitative analysis reveals that SD features have very different properties from existing representation learning features, such as the recently released DINOv2: while DINOv2 provides sparse but accurate matches, SD features provide high-quality spatial information but sometimes inaccurate semantic matches. We demonstrate that a simple fusion of these two features works surprisingly well, and a zero-shot evaluation using nearest neighbors on these fused features provides a significant performance gain over state-of-the-art methods on benchmark datasets, e.g., SPair-71k, PF-PASCAL, and TSS. We also show that these correspondences can enable interesting applications such as instance swapping between two images.
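
The fusion and nearest-neighbor matching described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes per-location SD and DINOv2 descriptors have already been extracted as arrays of shape (N, D), and the function names, the weight `alpha`, and the normalize-then-concatenate scheme are assumptions made for this example.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row of x to (approximately) unit L2 norm."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def fuse_features(sd_feats, dino_feats, alpha=0.5):
    """Fuse two descriptor sets for the same N locations.

    sd_feats:   (N, D_sd) array, hypothetical SD descriptors.
    dino_feats: (N, D_dino) array, hypothetical DINOv2 descriptors.
    alpha:      assumed weight balancing the two sources.
    Returns an (N, D_sd + D_dino) array.
    """
    sd = l2_normalize(np.asarray(sd_feats, dtype=float))
    dino = l2_normalize(np.asarray(dino_feats, dtype=float))
    # Normalizing first keeps one feature type from dominating by scale.
    return np.concatenate([alpha * sd, (1.0 - alpha) * dino], axis=-1)


def nearest_neighbor_matches(src_feats, tgt_feats):
    """For each source descriptor, return the index of the target
    descriptor with the highest cosine similarity."""
    sim = l2_normalize(src_feats) @ l2_normalize(tgt_feats).T
    return sim.argmax(axis=1)
```

With fused descriptors for a source and a target image, `nearest_neighbor_matches(src, tgt)` gives one target index per source location, with no training involved, which is the sense in which such an evaluation is zero-shot.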

Code

Junyi42/sd-dino official

208

Tasks

Representation Learning

Semantic correspondence

Semantic Segmentation

Datasets

SPair-71k

PF-PASCAL

Results from the Paper

Ranked #3 on Semantic correspondence on SPair-71k

Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Semantic correspondence	PF-PASCAL	SD+DINO (Supervised)	PCK	93.6	# 4
Semantic correspondence	SPair-71k	SD+DINO (Supervised)	PCK	74.6	# 3
Semantic correspondence	SPair-71k	SD+DINO (Zero-shot)	PCK	64.0	# 6
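
The PCK (percentage of correct keypoints) values above count a predicted keypoint as correct when it lies within a threshold distance of its ground-truth location. A minimal sketch of that computation is given below. The threshold factor `alpha`, the choice of reference size (commonly the longer side of the image or of the object bounding box), and the function name are assumptions for illustration; this page does not state the exact protocol behind the listed numbers.

```python
import numpy as np


def pck(pred_kps, gt_kps, ref_size, alpha=0.1):
    """Fraction of predicted keypoints within alpha * ref_size of the
    ground truth. Multiply by 100 to report a percentage.

    pred_kps, gt_kps: (K, 2) arrays of (x, y) coordinates.
    ref_size:         assumed reference length, e.g. max(height, width).
    alpha:            assumed threshold factor.
    """
    pred = np.asarray(pred_kps, dtype=float)
    gt = np.asarray(gt_kps, dtype=float)
    # Euclidean distance between each prediction and its ground truth.
    dists = np.linalg.norm(pred - gt, axis=-1)
    return float(np.mean(dists <= alpha * ref_size))
```

For example, `100 * pck(pred, gt, ref_size=max(h, w))` would yield a score on the same 0 to 100 scale as the table, under the assumptions stated above.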

Methods

Diffusion
