Saliency Prediction
88 papers with code • 3 benchmarks • 7 datasets
A saliency map predicts eye fixations on a visual scene. Saliency prediction is informed by the human visual attention mechanism and estimates the probability that a viewer's gaze rests at each position in the scene.
Libraries
Use these libraries to find Saliency Prediction models and implementations.
Latest papers
Spatio-Temporal Self-Attention Network for Video Saliency Prediction
3D convolutional neural networks have achieved promising results for video tasks in computer vision, including video saliency prediction that is explored in this paper.
Specificity-preserving RGB-D Saliency Detection
To effectively fuse cross-modal features in the shared learning network, we propose a cross-enhanced integration module (CIM) and then propagate the fused feature to the next layer for integrating cross-level information.
Energy-Based Generative Cooperative Saliency Prediction
In this paper, to model the uncertainty of visual saliency, we study the saliency prediction problem from the perspective of generative models by learning a conditional probability distribution over the saliency map given an input image, and treating the saliency prediction as a sampling process from the learned distribution.
DeepGaze IIE: Calibrated prediction in and out-of-domain for state-of-the-art saliency modeling
Since 2014, transfer learning has been the key driver of improvements in spatial saliency prediction; however, progress has stagnated over the last 3-5 years.
Generative Transformer for Accurate and Reliable Salient Object Detection
For the former, we apply a transformer to a deterministic model, and show that its effective structure modeling and global context modeling abilities lead to superior performance compared with CNN-based frameworks.
Noise-Aware Video Saliency Prediction
We note that the accuracy of the maps reconstructed from the gaze data of a fixed number of observers varies with the frame, as it depends on the content of the scene.
Modeling Object Dissimilarity for Deep Saliency Prediction
Saliency prediction has made great strides over the past two decades, with current techniques modeling low-level information, such as color, intensity and size contrasts, and high-level ones, such as attention and gaze direction for entire objects.
BTS-Net: Bi-directional Transfer-and-Selection Network For RGB-D Salient Object Detection
Depth information has been proved beneficial in RGB-D salient object detection (SOD).
Learning to Predict Salient Faces: A Novel Visual-Audio Saliency Model
Inspired by the findings of our investigation, we propose a novel multi-modal video saliency model consisting of three branches: visual, audio and face.
Rethinking 360° Image Visual Attention Modelling With Unsupervised Learning
This performance is achieved using an encoder that is trained in a completely unsupervised way and a relatively lightweight supervised decoder (3.8× fewer parameters in the case of the ResNet50 encoder).