TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Video Frame Interpolation	MSU Video Frame Interpolation	Super-SloMo	Subjective score	1.11	# 3
Video Frame Interpolation	MSU Video Frame Interpolation	Super-SloMo	PSNR	26.69	# 17
Video Frame Interpolation	MSU Video Frame Interpolation	Super-SloMo	SSIM	0.904	# 17
Video Frame Interpolation	MSU Video Frame Interpolation	Super-SloMo	VMAF	61.35	# 17
Video Frame Interpolation	MSU Video Frame Interpolation	Super-SloMo	LPIPS	0.068	# 19
Video Frame Interpolation	MSU Video Frame Interpolation	Super-SloMo	MS-SSIM	0.924	# 17
Video Frame Interpolation	MSU Video Frame Interpolation	Super-SloMo	FPS	3.1	# 3

Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation

CVPR 2018 · Huaizu Jiang, Deqing Sun, Varun Jampani, Ming-Hsuan Yang, Erik Learned-Miller, Jan Kautz

Given two consecutive frames, video interpolation aims at generating intermediate frame(s) to form both spatially and temporally coherent video sequences. While most existing methods focus on single-frame interpolation, we propose an end-to-end convolutional neural network for variable-length multi-frame video interpolation, where the motion interpretation and occlusion reasoning are jointly modeled. We start by computing bi-directional optical flow between the input images using a U-Net architecture. These flows are then linearly combined at each time step to approximate the intermediate bi-directional optical flows. These approximate flows, however, only work well in locally smooth regions and produce artifacts around motion boundaries. To address this shortcoming, we employ another U-Net to refine the approximated flow and also predict soft visibility maps. Finally, the two input images are warped and linearly fused to form each intermediate frame. By applying the visibility maps to the warped images before fusion, we exclude the contribution of occluded pixels to the interpolated intermediate frame to avoid artifacts. Since none of our learned network parameters are time-dependent, our approach is able to produce as many intermediate frames as needed. We use 1,132 video clips with 240-fps, containing 300K individual video frames, to train our network. Experimental results on several datasets, predicting different numbers of interpolated frames, demonstrate that our approach performs consistently better than existing methods.
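The two linear steps described in the abstract (combining the bi-directional flows at a time step, and fusing the warped images under visibility maps) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the quadratic-in-t coefficients are taken from my reading of the paper rather than from the abstract and should be checked against the PDF, and the two U-Nets and the backward-warping step are left out entirely. The helpers use plain arithmetic only, so they accept floats or NumPy-style arrays.

```python
def intermediate_timesteps(num_frames):
    """Evenly spaced time steps in (0, 1) for num_frames intermediate frames."""
    return [i / (num_frames + 1) for i in range(1, num_frames + 1)]


def approximate_intermediate_flows(flow_01, flow_10, t):
    """Linearly combine the two input flows to approximate the flows at time t.

    flow_01: flow from frame 0 to frame 1; flow_10: flow from frame 1 to frame 0.
    Returns (flow_t0, flow_t1), the approximate flows from time t to each input.
    Coefficients are an assumption based on the paper, not stated in the abstract.
    """
    flow_t0 = -(1.0 - t) * t * flow_01 + t * t * flow_10
    flow_t1 = (1.0 - t) ** 2 * flow_01 - t * (1.0 - t) * flow_10
    return flow_t0, flow_t1


def fuse_warped_frames(warped_0, warped_1, vis_0, vis_1, t, eps=1e-8):
    """Visibility-weighted linear fusion of the two warped input images.

    vis_0 and vis_1 are soft visibility maps in [0, 1]; a value near 0 marks a
    pixel as occluded in that input, so it contributes little to the result.
    eps is a hypothetical guard against division by zero.
    """
    weight_0 = (1.0 - t) * vis_0
    weight_1 = t * vis_1
    return (weight_0 * warped_0 + weight_1 * warped_1) / (weight_0 + weight_1 + eps)


if __name__ == "__main__":
    # A pixel moving 2 units to the right between the inputs, seen at each time step.
    for t in intermediate_timesteps(3):
        print(t, approximate_intermediate_flows(2.0, -2.0, t))
```

Because `t` enters only as an argument and not as a learned parameter, the same functions can be evaluated at any number of time steps, which mirrors the abstract's point about producing as many intermediate frames as needed.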

Code

avinashpaliwal/Super-SloMo

2,971

NVIDIA/unsupervised-video-interpola…

107

rmalav15/Super-SloMo

susomena/DeepSlowMotion

manastahir/Nvidia--SuperSlowMo-Keras

Tasks

Optical Flow Estimation

Video Frame Interpolation

Vocal Bursts Intensity Prediction

Datasets

KITTI

UCF101

MSU Video Frame Interpolation

SlowFlow

Results from the Paper

Ranked #3 on Video Frame Interpolation on MSU Video Frame Interpolation

Methods

Concatenated Skip Connection • Convolution • Max Pooling • ReLU • U-Net
