IFRNet: Intermediate Feature Refine Network for Efficient Frame Interpolation

Prevailing video frame interpolation algorithms, which generate intermediate frames from consecutive inputs, typically rely on complex model architectures with large parameter counts or high latency, which hinders their use in diverse real-time applications. In this work, we devise an efficient encoder-decoder based network, termed IFRNet, for fast intermediate frame synthesis. It first extracts pyramid features from the given inputs, and then refines the bilateral intermediate flow fields together with a powerful intermediate feature until generating the desired output. The gradually refined intermediate feature not only facilitates intermediate flow estimation, but also compensates for contextual details, so IFRNet does not need an additional synthesis or refinement module. To fully release its potential, we further propose a novel task-oriented optical flow distillation loss that focuses on learning the teacher knowledge useful for frame synthesis. Meanwhile, a new geometry consistency regularization term is imposed on the gradually refined intermediate features to better preserve structural layout. Experiments on various benchmarks demonstrate the excellent performance and fast inference speed of the proposed approach. Code is available at https://github.com/ltkong218/IFRNet.
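
To make the role of the bilateral intermediate flow fields concrete, the following is a minimal 1D sketch of the generic backward-warp-and-blend step used by flow-based interpolation. It is an illustration under stated assumptions, not IFRNet's decoder: the function names, the 1D signals, the hand-written flows and the blending mask are all hypothetical, and the sketch leaves out the pyramid features and the refined intermediate feature that the abstract describes.

```python
def sample_linear(signal, x):
    """Sample a 1D signal at fractional position x, clamping to the borders."""
    x = min(max(x, 0.0), len(signal) - 1.0)
    left = int(x)
    right = min(left + 1, len(signal) - 1)
    w = x - left
    return (1.0 - w) * signal[left] + w * signal[right]


def backward_warp(signal, flow):
    """For each target position i, fetch the source value at i + flow[i]."""
    return [sample_linear(signal, i + f) for i, f in enumerate(flow)]


def synthesize(frame0, frame1, flow_t0, flow_t1, mask):
    """Blend both warped inputs; mask[i] is the weight given to frame0 (assumed)."""
    warped0 = backward_warp(frame0, flow_t0)
    warped1 = backward_warp(frame1, flow_t1)
    return [m * a + (1.0 - m) * b for m, a, b in zip(mask, warped0, warped1)]


# Toy example: a bright pixel moves from index 2 (frame 0) to index 4 (frame 1).
frame0 = [0, 0, 1, 0, 0, 0, 0]
frame1 = [0, 0, 0, 0, 1, 0, 0]
# Hand-written flows from the middle frame (t = 0.5) back to each input.
flow_t0 = [-1.0] * 7  # look one pixel to the left in frame 0
flow_t1 = [1.0] * 7   # look one pixel to the right in frame 1
mask = [0.5] * 7      # equal trust in both inputs

middle = synthesize(frame0, frame1, flow_t0, flow_t1, mask)
print(middle)
```

In this toy case both warped inputs place the bright pixel at index 3, so the blend agrees with the expected midpoint. In a learned model the flows and the mask would be predicted by the network rather than written by hand.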

CVPR 2022
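| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Video Frame Interpolation | Middlebury | IFRNet | Interpolation Error | 4.216 | #1 |
| Video Frame Interpolation | MSU Video Frame Interpolation | IFRNet_large | PSNR | 28.04 | #8 |
| Video Frame Interpolation | MSU Video Frame Interpolation | IFRNet_large | SSIM | 0.921 | #7 |
| Video Frame Interpolation | MSU Video Frame Interpolation | IFRNet_large | VMAF | 66.98 | #11 |
| Video Frame Interpolation | MSU Video Frame Interpolation | IFRNet_large | LPIPS | 0.037 | #8 |
| Video Frame Interpolation | MSU Video Frame Interpolation | IFRNet_large | MS-SSIM | 0.943 | #10 |
| Video Frame Interpolation | MSU Video Frame Interpolation | IFRNet_small | PSNR | 27.45 | #13 |
| Video Frame Interpolation | MSU Video Frame Interpolation | IFRNet_small | SSIM | 0.908 | #15 |
| Video Frame Interpolation | MSU Video Frame Interpolation | IFRNet_small | VMAF | 63.43 | #15 |
| Video Frame Interpolation | MSU Video Frame Interpolation | IFRNet_small | LPIPS | 0.049 | #13 |
| Video Frame Interpolation | MSU Video Frame Interpolation | IFRNet_small | MS-SSIM | 0.931 | #15 |
| Video Frame Interpolation | MSU Video Frame Interpolation | IFRNet_base | PSNR | 27.67 | #12 |
| Video Frame Interpolation | MSU Video Frame Interpolation | IFRNet_base | SSIM | 0.909 | #14 |
| Video Frame Interpolation | MSU Video Frame Interpolation | IFRNet_base | VMAF | 64.16 | #13 |
| Video Frame Interpolation | MSU Video Frame Interpolation | IFRNet_base | LPIPS | 0.048 | #12 |
| Video Frame Interpolation | MSU Video Frame Interpolation | IFRNet_base | MS-SSIM | 0.932 | #14 |
| Video Frame Interpolation | UCF101 | IFRNet | PSNR | 35.42 | #5 |
| Video Frame Interpolation | UCF101 | IFRNet | SSIM | 0.9698 | #8 |
| Video Frame Interpolation | Vimeo90K | IFRNet | PSNR | 36.20 | #5 |
| Video Frame Interpolation | Vimeo90K | IFRNet | SSIM | 0.9808 | #5 |
| Video Frame Interpolation | Vimeo90K | IFRNet | Speed (ms/f) | 16 (Tesla V100) | #1 |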
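
For readers unfamiliar with the PSNR metric reported above, here is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE), in plain Python. The helper name `psnr`, the flat pixel lists and the 8-bit peak value of 255 are assumptions made for illustration; the benchmarks may differ in color space, cropping and averaging, so this will not reproduce the listed numbers.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: the error is zero
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 8-pixel ground-truth frame and an interpolated estimate of it.
ground_truth = [52, 55, 61, 59, 79, 61, 76, 61]
interpolated = [54, 55, 60, 59, 77, 62, 76, 60]
print(f"{psnr(ground_truth, interpolated):.2f} dB")
```

Higher PSNR means the interpolated frame is closer to the ground truth in the mean-squared sense. It does not capture perceptual quality, which is one reason the table also lists SSIM, LPIPS and VMAF.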

Methods