Source: He et al
Deep generative architectures provide a way to model not only images but also complex, 3-dimensional objects, such as point clouds.
Most existing 3D object recognition algorithms focus on leveraging the strong discriminative power of deep learning models with softmax loss for the classification of 3D data, while learning discriminative features with deep metric learning for 3D object retrieval is more or less neglected.
We propose a novel approach to jointly perform 3D shape retrieval and pose estimation from monocular images. In order to make the method robust to real-world image variations, e. g. complex textures and backgrounds, we learn an embedding space from 3D data that only includes the relevant information, namely the shape and pose.