Gaze Estimation is a task to predict where a person is looking at given the person’s full face. The task contains two directions: 3-D gaze vector and 2-D gaze position estimation. 3-D gaze vector estimation is to predict the gaze vector, which is usually used in the automotive safety. 2-D gaze position estimation is to predict the horizontal and vertical coordinates on a 2-D screen, which allows utilizing gaze point to control a cursor for human-machine interaction.
|TREND||DATASET||BEST METHOD||PAPER TITLE||PAPER||CODE||COMPARE|
Tracking both the head and eyes (pupils) can provide reliable estimation of a driver's gaze using face images under ideal conditions.
To improve estimation accuracy, we propose to use dilated-convolutions in a deep convolutional neural network to capture subtle changes in the eye images, and a novel gaze decomposition method that decomposes the gaze angle into the sum of a subject-independent gaze estimate from the image and a subject-dependent bias.
Although automatic gaze estimation is very important to a large variety of application areas, it is difficult to train accurate and robust gaze models, in great part due to the difficulty in collecting large and diverse data (annotating 3D gaze is expensive and existing datasets use different setups).
We further incorporate our proposed RT-BENE baselines in the recently presented RT-GENE gaze estimation framework where it provides a real-time inference of the openness of the eyes.
Ranked #1 on Blink estimation on RT-BENE
Finally, we demonstrate an application of our model for estimating customer attention in a supermarket setting.
For the second issue, we define a new metric to measure the robustness of gaze estimator, and propose an adversarial training based Disturbance with Ordinal loss (DwO) method to improve it.
Accurate eye segmentation can improve eye-gaze estimation and support interactive computing based on visual attention; however, existing eye segmentation methods suffer from issues such as person-dependent accuracy, lack of robustness, and an inability to be run in real-time.
nature of this data suggests better estimation may be possible if the model explicitly made use of such "repeated measurements" from each user as is commonly done in classical statistical analysis using so-called mixed effects models.