We present a novel approach for unsupervised learning of depth and ego-motion from monocular video.
We present an approach which takes advantage of both structure and semantics for unsupervised monocular learning of depth and ego-motion.
The ability to predict depth from a single image, using recent advances in CNNs, is of increasing interest to the vision community.
To the best of our knowledge, this is the first work to show that deep networks trained using unlabelled monocular videos can predict globally scale-consistent camera trajectories over a long video sequence.
Ranked #12 on Monocular Depth Estimation on KITTI Eigen split
Despite learning-based methods showing promising results in single-view depth estimation and visual odometry, most existing approaches treat the tasks in a supervised manner.