Geometry-Free View Synthesis: Transformers and no 3D Priors

ICCV 2021 · Robin Rombach, Patrick Esser, Björn Ommer

Is a geometric model required to synthesize novel views from a single image? Being bound to local convolutions, CNNs need explicit 3D biases to model geometric transformations. In contrast, we demonstrate that a transformer-based model can synthesize entirely novel views without any hand-engineered 3D biases. This is achieved by (i) a global attention mechanism for implicitly learning long-range 3D correspondences between source and target views, and (ii) a probabilistic formulation necessary to capture the ambiguity inherent in predicting novel views from a single image, thereby overcoming the limitations of previous approaches that are restricted to relatively small viewpoint changes. We evaluate various ways to integrate 3D priors into a transformer architecture. However, our experiments show that no such geometric priors are required and that the transformer is capable of implicitly learning 3D relationships between images. Furthermore, this approach outperforms the state of the art in terms of visual quality while covering the full distribution of possible realizations. Code is available at https://git.io/JOnwn
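The core idea described in the abstract can be illustrated with a short sketch: a transformer models the discrete tokens of the target view autoregressively, conditioned on the source-view tokens and the relative camera transformation, so that global attention can relate any target location to any source location without a hand-engineered 3D bias. The PyTorch code below is an illustrative sketch only, not the authors' implementation; the class name ViewTransformer, the flattened camera parameterization (cam_dim), and all hyperparameters are assumptions made for this example.

```python
# Minimal sketch (not the authors' code) of an autoregressive transformer
# over target-view tokens, conditioned on source-view tokens and the
# relative camera transformation. All names and sizes are illustrative.
import torch
import torch.nn as nn

class ViewTransformer(nn.Module):
    def __init__(self, vocab_size=1024, n_tokens=256, cam_dim=12,
                 d_model=512, n_heads=8, n_layers=8):
        super().__init__()
        self.n_tokens = n_tokens
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        # learned positions for [camera | source tokens | target tokens]
        self.pos_emb = nn.Parameter(torch.zeros(1, 1 + 2 * n_tokens, d_model))
        self.cam_emb = nn.Linear(cam_dim, d_model)  # flattened (R, t) etc.
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, src_tokens, cam, tgt_tokens):
        # src_tokens, tgt_tokens: (B, n_tokens) discrete codes from a
        # pretrained image tokenizer; cam: (B, cam_dim) flattened relative
        # camera transformation between source and target view.
        x = torch.cat([self.cam_emb(cam).unsqueeze(1),
                       self.tok_emb(src_tokens),
                       self.tok_emb(tgt_tokens)], dim=1)
        x = x + self.pos_emb[:, :x.shape[1]]

        # Causal mask: every target token attends to the camera embedding,
        # all source tokens, and the target tokens generated before it.
        L = x.shape[1]
        ctx = 1 + self.n_tokens  # length of the conditioning prefix
        mask = torch.triu(torch.full((L, L), float("-inf"), device=x.device),
                          diagonal=1)
        mask[:, :ctx] = 0.0  # conditioning is visible from every position

        h = self.blocks(x, mask=mask)
        # the hidden state at the position *before* each target token predicts it
        logits = self.head(h[:, ctx - 1:ctx - 1 + tgt_tokens.shape[1]])
        return logits  # (B, n_tokens, vocab_size), trained with cross-entropy
```

In such a setup, training minimizes the cross-entropy of the predicted target tokens (the NLL metric reported below), and at test time the target tokens are sampled autoregressively and decoded back to an image; sampling different token sequences yields different plausible target views, which is what the probabilistic formulation in the abstract refers to.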


Results from the Paper


Task                  Dataset        Model           Metric  Value   Global Rank
Novel View Synthesis  ACID           impl.-nodepth   FID     42.88   #1
Novel View Synthesis  ACID           impl.-catdepth  SSIM    0.42    #1
Novel View Synthesis  ACID           hybrid          NLL     5.341   #1
Novel View Synthesis  ACID           hybrid          PSIM    2.83    #1
Novel View Synthesis  ACID           hybrid          PSNR    15.54   #1
Novel View Synthesis  RealEstate10K  hybrid          FID     48.84   #1
Novel View Synthesis  RealEstate10K  hybrid          PSNR    12.51   #1
Novel View Synthesis  RealEstate10K  impl.-depth     NLL     4.836   #1
Novel View Synthesis  RealEstate10K  impl.-depth     PSIM    3.05    #1
Novel View Synthesis  RealEstate10K  impl.-depth     SSIM    0.44    #1
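For reference, the per-image reconstruction metrics in the table (PSNR and SSIM, both higher is better) can be computed as sketched below. This is a generic illustration, not the paper's evaluation script; it assumes float RGB images in [0, 1] and scikit-image >= 0.19 for the channel_axis argument. FID, NLL, and PSIM (a learned perceptual similarity, lower is better) additionally require an Inception network, the model's likelihood, and a perceptual feature extractor, and are not shown.

```python
# Generic sketch (not the paper's evaluation code) of the per-image
# reconstruction metrics reported above.
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def reconstruction_metrics(pred, target):
    """pred, target: (H, W, 3) float arrays in [0, 1]."""
    psnr = peak_signal_noise_ratio(target, pred, data_range=1.0)  # higher is better
    ssim = structural_similarity(target, pred, data_range=1.0,
                                 channel_axis=-1)                 # higher is better
    return psnr, ssim
```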
