RT-GENE: Real-Time Eye Gaze Estimation in Natural Environments

In this work, we consider the problem of robust gaze estimation in natural environments, where large camera-to-subject distances and high variations in head pose and eye gaze angles are common. These conditions lead to two main shortfalls in state-of-the-art gaze estimation methods: hindered ground-truth gaze annotation and diminished estimation accuracy as image resolution decreases with distance. We first record a novel dataset of varied gaze and head pose images in a natural environment, addressing the issue of ground-truth annotation by measuring head pose with a motion capture system and eye gaze with mobile eye-tracking glasses. To bridge the gap between training and testing images, we apply semantic image inpainting to the area covered by the glasses, removing their obtrusiveness. We also present a new real-time algorithm involving appearance-based deep convolutional neural networks with increased capacity to cope with the diverse images in the new dataset. We conduct experiments with this architecture on a number of diverse eye-gaze datasets, including our own, as well as in cross-dataset evaluations. We demonstrate state-of-the-art estimation accuracy in all experiments, and the architecture performs well even on lower-resolution images.
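For concreteness, the sketch below illustrates the general shape of an appearance-based gaze estimator of the kind the abstract describes: two eye-patch encoders whose features are fused with head pose to regress gaze angles, plus prediction averaging over an ensemble (the "2 model" and "4 model" entries in the results below). This is a minimal PyTorch sketch, not the paper's actual architecture: the layer sizes, the `GazeEstimator` and `ensemble_predict` names, and the (yaw, pitch) parameterization are illustrative assumptions, and the real networks are substantially deeper.

```python
import torch
import torch.nn as nn

def make_eye_encoder(feat_dim: int = 128) -> nn.Sequential:
    # Tiny stand-in for the paper's much deeper eye-patch backbones.
    return nn.Sequential(
        nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(64, feat_dim), nn.ReLU(),
    )

class GazeEstimator(nn.Module):
    """Regresses gaze (yaw, pitch) from left/right eye patches and head pose."""
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.left_enc = make_eye_encoder(feat_dim)
        self.right_enc = make_eye_encoder(feat_dim)
        # Fuse both eye features with the 2-D head pose (yaw, pitch).
        self.regressor = nn.Sequential(
            nn.Linear(2 * feat_dim + 2, 128), nn.ReLU(),
            nn.Linear(128, 2),  # gaze yaw and pitch, in radians
        )

    def forward(self, left_eye, right_eye, head_pose):
        feats = torch.cat(
            [self.left_enc(left_eye), self.right_enc(right_eye), head_pose],
            dim=1)
        return self.regressor(feats)

def ensemble_predict(models, left_eye, right_eye, head_pose):
    """Average gaze predictions over independently trained models."""
    with torch.no_grad():
        preds = torch.stack(
            [m(left_eye, right_eye, head_pose) for m in models])
    return preds.mean(dim=0)

# Usage with random inputs (batch of 1; eye-patch size is illustrative).
models = [GazeEstimator().eval() for _ in range(4)]
eye = torch.randn(1, 3, 36, 60)
gaze = ensemble_predict(models, eye, eye.clone(), torch.zeros(1, 2))
print(gaze.shape)  # torch.Size([1, 2])
```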


Datasets


Introduced in the Paper:

RT-GENE

Used in the Paper:

MPIIGaze
GazeCapture
UT Multi-view

Results from the Paper


| Task | Dataset | Model | Angular Error (°) | Global Rank |
|---|---|---|---|---|
| Gaze Estimation | MPII Gaze | RT-GENE (2-model ensemble) | 4.6 | #4 |
| Gaze Estimation | MPII Gaze | RT-GENE (4-model ensemble) | 4.3 | #3 |
| Gaze Estimation | MPII Gaze | RT-GENE (single model) | 4.8 | #5 |
| Gaze Estimation | RT-GENE | RT-GENE (4-model ensemble) | 7.7 | #1 |
| Gaze Estimation | UT Multi-view | RT-GENE (4-model ensemble) | 5.1 | #1 |
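The Angular Error values above are 3-D angles in degrees between the predicted and ground-truth gaze directions. As a reference for how this metric is typically computed, here is a NumPy sketch; the `angles_to_vector` conversion and its axis conventions are assumptions (conventions differ across datasets), not the benchmarks' canonical code.

```python
import numpy as np

def angles_to_vector(yaw, pitch):
    """Convert gaze (yaw, pitch) in radians to a 3-D unit vector.
    One common convention; axis signs vary between datasets."""
    return np.array([
        -np.cos(pitch) * np.sin(yaw),
        -np.sin(pitch),
        -np.cos(pitch) * np.cos(yaw),
    ])

def angular_error_deg(pred, gt):
    """Angle in degrees between predicted and ground-truth gaze vectors."""
    p = angles_to_vector(*pred)
    g = angles_to_vector(*gt)
    cos = np.dot(p, g) / (np.linalg.norm(p) * np.linalg.norm(g))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# Example: a prediction off by 0.08 rad in yaw at near-frontal pitch.
print(angular_error_deg((0.10, 0.05), (0.02, 0.05)))  # ~4.6 degrees
```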

Methods


No methods listed for this paper.