Video Salient Object Detection

20 papers with code • 10 benchmarks • 4 datasets

Video salient object detection (VSOD) is essential for understanding the mechanisms underlying the human visual system (HVS) during free viewing, and it is instrumental to a wide range of real-world applications, e.g., video segmentation, video captioning, video compression, autonomous driving, robotic interaction, and weakly supervised attention. Beyond its academic value and practical significance, VSOD is difficult because of the challenges inherent in video data (diverse motion patterns, occlusions, blur, large object deformations, etc.) and the complexity of human visual attention behavior (i.e., selective attention allocation and attention shift) in dynamic scenes. Online benchmark: http://dpfan.net/davsod.

(Image credit: Shifting More Attention to Video Salient Object Detection, CVPR 2019 Best Paper Finalist)

Latest papers with no code

Reframe Anything: LLM Agent for Open World Video Reframing

no code yet • 10 Mar 2024

The proliferation of mobile devices and social media has revolutionized content dissemination, with short-form video becoming increasingly prevalent.

SimulFlow: Simultaneously Extracting Feature and Identifying Target for Unsupervised Video Object Segmentation

no code yet • 30 Nov 2023

We evaluate our method on several benchmark datasets and achieve state-of-the-art results.

A Spatial-Temporal Dual-Mode Mixed Flow Network for Panoramic Video Salient Object Detection

no code yet • 13 Oct 2023

First, the ILA module calculates the attention between adjacent level features of consecutive frames of panoramic video to improve the accuracy of extracting salient object features from the spatial flow.
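As a rough illustration of what attention between features of two consecutive frames can look like, the sketch below applies generic scaled dot-product attention, with queries taken from one frame's features and keys and values taken from the next frame's features. This is not the paper's ILA module: the function names, the toy feature vectors, and the choice of scaled dot-product attention are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product attention (hypothetical helper).

    Each query vector attends over all key vectors; the output for a
    query is the attention-weighted average of the value vectors.
    """
    scale = math.sqrt(len(queries[0]))
    out_dim = len(values[0])
    output = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        output.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(out_dim)]
        )
    return output


# Toy features (assumed): two locations from frame t at one feature level,
# three locations from frame t+1 at the adjacent feature level.
frame_t_feats = [[1.0, 0.0], [0.0, 1.0]]
frame_t1_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

attended = cross_attention(frame_t_feats, frame_t1_feats, frame_t1_feats)
```

Each row of `attended` is a convex combination of the next frame's features, weighted toward the locations most similar to the corresponding query.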

UniST: Towards Unifying Saliency Transformer for Video Saliency Prediction and Detection

no code yet • 15 Sep 2023

While many approaches have crafted task-specific training paradigms for either video saliency prediction or video salient object detection, little attention has been devoted to devising a generalized saliency modeling framework that seamlessly bridges these two distinct tasks.

Panoramic Video Salient Object Detection with Ambisonic Audio Guidance

no code yet • 26 Nov 2022

In this paper, we aim to tackle the video salient object detection problem for panoramic videos with their corresponding ambisonic audio.

PSNet: Parallel Symmetric Network for Video Salient Object Detection

no code yet • 12 Oct 2022

Finally, we use the Importance Perception Fusion (IPF) module to fuse the features from two parallel branches according to their different importance in different scenarios.
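One simple way to picture fusing two branches "according to their importance" is a softmax-weighted blend, sketched below. This is only an assumed stand-in, not the paper's IPF module: the function `fuse_branches`, the scalar importance scores, and the toy feature vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_branches(feat_a, feat_b, score_a, score_b):
    """Blend two equally sized feature vectors (hypothetical helper).

    The scalar importance scores are normalized with a softmax, so the
    branch with the higher score contributes more to the fused output.
    """
    if len(feat_a) != len(feat_b):
        raise ValueError("both branches must produce features of the same size")
    w_a, w_b = softmax([score_a, score_b])
    return [w_a * a + w_b * b for a, b in zip(feat_a, feat_b)]


# Toy features from two parallel branches (values are made up).
branch_a = [0.2, 0.8, 0.5]
branch_b = [0.6, 0.4, 0.1]

fused_equal = fuse_branches(branch_a, branch_b, 0.0, 0.0)  # equal importance
fused_b_heavy = fuse_branches(branch_a, branch_b, 0.0, 5.0)  # branch B dominates
```

With equal scores the result is the plain average of the two branches; as one score grows, the fused features approach that branch's features. In a learned model the scores would typically be predicted per scene rather than set by hand.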

Weakly Supervised Video Salient Object Detection via Point Supervision

no code yet • 15 Jul 2022

Several works attempt to use scribble annotations to mitigate this problem, but point supervision, a more labor-saving annotation method (indeed the most labor-saving among manual annotation methods for dense prediction), has not been explored.

A Novel Long-term Iterative Mining Scheme for Video Salient Object Detection

no code yet • 20 Jun 2022

Existing state-of-the-art (SOTA) video salient object detection (VSOD) models have widely followed a short-term methodology, which dynamically determines the balance between spatial and temporal saliency fusion by considering only a limited number of consecutive frames around the current one.

Video Salient Object Detection via Contrastive Features and Attention Modules

no code yet • 3 Nov 2021

Video salient object detection aims to find the most visually distinctive objects in a video.

Guidance and Teaching Network for Video Salient Object Detection

no code yet • 21 May 2021

Owing to the difficulty of mining spatial-temporal cues, existing approaches to video salient object detection (VSOD) are limited in their understanding of complex and noisy scenarios and often fail to infer prominent objects.