Eye tracking research
This paper conducts a systematic study on the role of visual attention in Unsupervised Video Object Segmentation (UVOS) tasks.
In this work we aim to predict the driver's focus of attention.
Attention maps from ophthalmologists are also collected in the LAG database through a simulated eye-tracking experiment.
Video-based eye tracking is a valuable technique in various research fields.
A comparison against Wang et al. shows that our method advances the state of the art in 3D eye tracking using a single RGB camera.
Our results suggest that (1) audio is a strong contributing cue for saliency prediction, (2) a salient visible sound source is the natural cause of the superiority of our Audio-Visual model, (3) richer feature representations for the input space lead to more powerful predictions even in the absence of more sophisticated saliency decoders, and (4) the Audio-Visual model improves over 53.54% of the frames predicted by the best Visual model (our baseline).
We introduce STAViS, a spatio-temporal audiovisual saliency network that combines spatio-temporal visual and auditory information in order to efficiently address the problem of saliency estimation in videos.
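The core idea in such audiovisual saliency networks — merging a visual saliency map with a spatially localized audio cue into a single normalized map — can be sketched in miniature. The function, the additive blend, and the `alpha` weight below are illustrative assumptions for exposition, not the actual STAViS architecture.

```python
import numpy as np

def fuse_saliency(visual, audio_energy, alpha=0.5):
    """Toy audiovisual fusion: blend a visual saliency map with a
    spatially localized audio-energy map, then renormalize so the
    result is a probability map. The additive blend and alpha weight
    are illustrative assumptions, not the published fusion scheme."""
    v = visual / (visual.sum() + 1e-8)        # normalize visual cue
    a = audio_energy / (audio_energy.sum() + 1e-8)  # normalize audio cue
    fused = (1 - alpha) * v + alpha * a       # convex combination
    return fused / fused.sum()                # renormalize to sum to 1

# Toy 4x4 maps: visual saliency peaked top-left, sound source bottom-right.
visual = np.zeros((4, 4)); visual[0, 0] = 1.0
audio = np.zeros((4, 4)); audio[3, 3] = 1.0
fused = fuse_saliency(visual, audio, alpha=0.5)
```

With equal weighting, the fused map splits attention mass between the visual peak and the sound-source location, which is the qualitative behavior the result in point (2) above describes.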