TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Video Generation	UCF-101	VideoFusion (128x128, class-conditional)	Inception Score	80.03	# 4
Video Generation	UCF-101	VideoFusion (128x128, class-conditional)	FVD16	173	# 7
Video Generation	UCF-101	VideoFusion (128x128, unconditional)	Inception Score	72.22	# 7
Video Generation	UCF-101	VideoFusion (128x128, unconditional)	FVD16	220	# 8

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/decomposed-diffusion-models-for-high-quality/video-generation-on-ucf-101)](https://paperswithcode.com/sota/video-generation-on-ucf-101?p=decomposed-diffusion-models-for-high-quality)`

VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation

CVPR 2023 · Zhengxiong Luo, Dayou Chen, Yingya Zhang, Yan Huang, Liang Wang, Yujun Shen, Deli Zhao, Jingren Zhou, Tieniu Tan

A diffusion probabilistic model (DPM), which constructs a forward diffusion process by gradually adding noise to data points and learns the reverse denoising process to generate new samples, has been shown to handle complex data distributions. Despite the recent success of DPMs in image synthesis, applying them to video generation remains challenging because of the high-dimensional data space. Previous methods usually adopt a standard diffusion process, in which frames of the same video clip are corrupted with independent noise, ignoring content redundancy and temporal correlation. This work presents a decomposed diffusion process that resolves the per-frame noise into a base noise shared among all frames and a residual noise that varies along the time axis. The denoising pipeline employs two jointly learned networks to match this noise decomposition. Experiments on various datasets confirm that our approach, termed VideoFusion, surpasses both GAN-based and diffusion-based alternatives in high-quality video generation. We further show that our decomposed formulation can benefit from pre-trained image diffusion models and supports text-conditioned video creation well.
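The noise decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the single mixing weight `lam`, the variance-preserving square-root combination, and the `alpha_bar` noise-schedule value are all assumptions made for illustration. The sketch draws one base noise tensor shared by every frame and one residual noise tensor per frame, then mixes them so that each frame's noise keeps unit variance while different frames stay correlated.

```python
import numpy as np


def decomposed_noise(num_frames, frame_shape, lam, rng):
    """Sample per-frame noise as shared base noise plus per-frame residual noise.

    `lam` in [0, 1] is a hypothetical mixing weight (an assumption of this sketch):
    lam = 0 gives fully independent per-frame noise, lam = 1 gives identical noise.
    """
    base = rng.standard_normal(frame_shape)                     # shared by all frames
    residual = rng.standard_normal((num_frames, *frame_shape))  # varies along time
    # Square-root weights keep each frame's noise at unit variance.
    noise = np.sqrt(lam) * base[None] + np.sqrt(1.0 - lam) * residual
    return noise, base, residual


def forward_diffuse(video, noise, alpha_bar):
    """Standard DDPM-style forward step applied with the decomposed noise.

    `alpha_bar` is the cumulative signal-retention factor at some timestep
    (a hypothetical value here, not taken from the paper).
    """
    return np.sqrt(alpha_bar) * video + np.sqrt(1.0 - alpha_bar) * noise


rng = np.random.default_rng(0)
video = rng.standard_normal((16, 3, 32, 32))  # stand-in for a 16-frame clip
noise, base, residual = decomposed_noise(16, (3, 32, 32), lam=0.5, rng=rng)
noisy_video = forward_diffuse(video, noise, alpha_bar=0.7)

# The noise of two different frames is correlated with strength close to lam.
corr = np.corrcoef(noise[0].ravel(), noise[1].ravel())[0, 1]
print(noisy_video.shape, round(float(corr), 2))
```

With independent per-frame noise (`lam=0.0`) the printed correlation would be near zero, which is the standard diffusion setting that the abstract contrasts against.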

Code

modelscope/modelscope official

6,067

Tasks

Code Generation

Denoising

Image Generation

Text-to-Video Generation

Video Generation

Vocal Bursts Intensity Prediction

Datasets

UCF101

WebVid

Results from the Paper

Ranked #7 on Video Generation on UCF-101

Methods

BASE • CLIP • Diffusion
