Dual-Stream Fusion Network for Spatiotemporal Video Super-Resolution

Visual data upsampling is an important research topic for improving perceptual quality and benefiting a wide range of computer vision applications. In recent years, the renaissance of deep learning has brought remarkable progress in image and video super-resolution. However, most existing methods advance super-resolution along either the spatial or the temporal dimension, i.e., they increase the spatial resolution or the video frame rate, but not both. In this paper, we instead address both dimensions jointly and tackle the spatiotemporal upsampling problem. Our method builds on a key observation: although directly cascading existing spatial and temporal super-resolution methods already achieves spatiotemporal upsampling, the two cascade orders produce results with complementary properties. We therefore propose a dual-stream fusion network that adaptively fuses the intermediate results of two spatiotemporal upsampling streams: the first applies spatial super-resolution followed by temporal super-resolution, while the second cascades them in the reverse order. Extensive experiments verify the efficacy of the proposed method against several baselines. Moreover, we investigate various spatial and temporal upsampling methods as the basis of our two-stream model, demonstrating the flexibility and wide applicability of the proposed framework.
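
The dual-stream design described above can be summarized in a few lines of code. The following is a minimal PyTorch sketch of the idea, using toy stand-in upsamplers (per-frame bicubic upscaling and linear frame blending) and a hypothetical convolutional mask network for the adaptive fusion; these are assumptions for illustration and do not correspond to the paper's actual sub-networks, which are not specified on this page.

```python
# Minimal sketch of the dual-stream fusion idea. SpatialSR / TemporalSR are
# illustrative stand-ins: any spatial and temporal upsamplers could be used.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatialSR(nn.Module):
    """Placeholder spatial upsampler: bicubic upscaling applied per frame."""

    def __init__(self, scale=4):
        super().__init__()
        self.scale = scale

    def forward(self, video):  # video: (B, T, C, H, W)
        b, t, c, h, w = video.shape
        frames = video.view(b * t, c, h, w)
        up = F.interpolate(frames, scale_factor=self.scale,
                           mode="bicubic", align_corners=False)
        return up.view(b, t, c, h * self.scale, w * self.scale)


class TemporalSR(nn.Module):
    """Placeholder temporal upsampler: linear blending of adjacent frames."""

    def forward(self, video):  # video: (B, T, C, H, W)
        prev, nxt = video[:, :-1], video[:, 1:]
        mid = 0.5 * (prev + nxt)               # naive frame interpolation
        out = torch.stack([prev, mid], dim=2)  # interleave originals and mids
        out = out.flatten(1, 2)                # (B, 2*(T-1), C, H, W)
        return torch.cat([out, video[:, -1:]], dim=1)


class DualStreamFusion(nn.Module):
    """Fuse the two cascade orders (S->T and T->S) with a learned pixel mask."""

    def __init__(self, channels=3, scale=4):
        super().__init__()
        self.spatial = SpatialSR(scale)
        self.temporal = TemporalSR()
        # Predicts a per-pixel weight from the concatenated stream outputs.
        self.mask_net = nn.Sequential(
            nn.Conv2d(2 * channels, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, lr_video):                    # (B, T, C, H, W)
        st = self.temporal(self.spatial(lr_video))  # stream 1: spatial, then temporal
        ts = self.spatial(self.temporal(lr_video))  # stream 2: temporal, then spatial
        b, t, c, h, w = st.shape
        pair = torch.cat([st, ts], dim=2).view(b * t, 2 * c, h, w)
        m = self.mask_net(pair).view(b, t, 1, h, w)  # adaptive fusion weight
        return m * st + (1 - m) * ts


video = torch.rand(1, 4, 3, 32, 32)   # toy low-resolution, low-frame-rate clip
out = DualStreamFusion()(video)
print(out.shape)                      # torch.Size([1, 7, 3, 128, 128])
```

Because the two streams yield tensors of identical shape, the fusion reduces to a convex per-pixel combination; the mask network decides, for each output pixel, which cascade order to trust more.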
