Temporal Modulation Network for Controllable Space-Time Video Super-Resolution

Space-time video super-resolution (STVSR) aims to increase both the spatial and temporal resolution of low-resolution, low-frame-rate videos. Recent methods based on deformable convolution have achieved promising STVSR performance, but they can only infer the intermediate frame that was pre-defined at training time. Moreover, these methods undervalue the short-term motion cues among adjacent frames. In this paper, we propose a Temporal Modulation Network (TMNet) to interpolate arbitrary intermediate frame(s) with accurate high-resolution reconstruction. Specifically, we propose a Temporal Modulation Block (TMB) that modulates deformable convolution kernels for controllable feature interpolation. To better exploit temporal information, we propose a Locally-temporal Feature Comparison (LFC) module which, together with a Bi-directional Deformable ConvLSTM, extracts short-term and long-term motion cues in videos. Experiments on three benchmark datasets demonstrate that TMNet outperforms previous STVSR methods. The code is available at https://github.com/CS-GangXu/TMNet.
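
To make the idea of "arbitrary intermediate frame(s)" concrete, the sketch below lists the normalized times t in (0, 1) at which frames would be inserted for a given frame-rate multiplier, and shows how a time value can steer an interpolated feature. The function names `intermediate_times` and `naive_blend` are hypothetical, and the linear blend is only a stand-in for illustration; it is not the Temporal Modulation Block, which the abstract describes as modulating deformable convolution kernels.

```python
from fractions import Fraction


def intermediate_times(factor):
    """Return the normalized times t in (0, 1) of the frames to insert
    between two consecutive input frames for a frame-rate multiplier."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    # For a multiplier of n, the n - 1 new frames sit at t = 1/n, ..., (n-1)/n.
    return [Fraction(k, factor) for k in range(1, factor)]


def naive_blend(feat_a, feat_b, t):
    """Linear blend of two feature vectors, controlled by t.

    Assumption: this is a placeholder for a learned interpolator and is
    NOT the TMNet method. It only shows t acting as a control signal.
    """
    w = float(t)
    return [(1.0 - w) * a + w * b for a, b in zip(feat_a, feat_b)]


# Quadrupling the frame rate inserts three frames between each input pair.
times = intermediate_times(4)
frames = [naive_blend([0.0, 8.0], [4.0, 0.0], t) for t in times]
```

A model trained for a single fixed intermediate frame corresponds to always using t = 1/2; a controllable model accepts any value returned by `intermediate_times`.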

Published at CVPR 2021.
| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Video Super-Resolution | MSU Super-Resolution for Video Compression | TMNet + x264 | BSQ-rate over ERQA | 1.879 | # 14 |
| | | | BSQ-rate over VMAF | 1.061 | # 18 |
| | | | BSQ-rate over PSNR | 1.481 | # 11 |
| | | | BSQ-rate over MS-SSIM | 0.844 | # 15 |
| | | | BSQ-rate over LPIPS | 1.377 | # 21 |
| Video Super-Resolution | MSU Super-Resolution for Video Compression | TMNet + aomenc | BSQ-rate over ERQA | 21.798 | # 81 |
| | | | BSQ-rate over VMAF | 4.667 | # 62 |
| | | | BSQ-rate over PSNR | 15.144 | # 73 |
| | | | BSQ-rate over MS-SSIM | 10.322 | # 80 |
| | | | BSQ-rate over LPIPS | 6.276 | # 50 |
| Video Super-Resolution | MSU Super-Resolution for Video Compression | TMNet + vvenc | BSQ-rate over ERQA | 21.303 | # 80 |
| | | | BSQ-rate over VMAF | 1.795 | # 40 |
| | | | BSQ-rate over PSNR | 9.43 | # 51 |
| | | | BSQ-rate over MS-SSIM | 1.813 | # 35 |
| | | | BSQ-rate over LPIPS | 13.988 | # 79 |
| Video Super-Resolution | MSU Super-Resolution for Video Compression | TMNet + x265 | BSQ-rate over ERQA | 13.577 | # 62 |
| | | | BSQ-rate over VMAF | 2.009 | # 46 |
| | | | BSQ-rate over PSNR | 7.046 | # 40 |
| | | | BSQ-rate over MS-SSIM | 1.735 | # 34 |
| | | | BSQ-rate over LPIPS | 13.485 | # 76 |
| Video Super-Resolution | MSU Super-Resolution for Video Compression | TMNet + uavs3e | BSQ-rate over ERQA | 13.187 | # 56 |
| | | | BSQ-rate over VMAF | 3.487 | # 56 |
| | | | BSQ-rate over PSNR | 15.144 | # 73 |
| | | | BSQ-rate over MS-SSIM | 4.317 | # 51 |
| | | | BSQ-rate over LPIPS | 5.015 | # 46 |
| Video Super-Resolution | MSU Video Super Resolution Benchmark: Detail Restoration | TMNet | Subjective score | 6 | # 8 |
| | | | ERQAv1.0 | 0.712 | # 9 |
| | | | QRCRv1.0 | 0.549 | # 14 |
| | | | SSIM | 0.885 | # 7 |
| | | | PSNR | 30.364 | # 7 |
| | | | FPS | 1.136 | # 12 |
| | | | 1 - LPIPS | 0.931 | # 3 |
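
For readers unfamiliar with the PSNR metric listed in these results, the following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE), over flat sequences of pixel values. The function name `psnr` and the default peak value of 255 are assumptions for illustration; the benchmark's exact protocol (color space, cropping, per-frame averaging) is not stated here and may differ.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length
    sequences of pixel values, using the standard MSE-based formula."""
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate a reconstruction closer to the reference, and identical inputs give an infinite PSNR.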
