3D Human Pose Estimation from a Single Image via Distance Matrix Regression

CVPR 2017  ·  Francesc Moreno-Noguer ·

This paper addresses the problem of 3D human pose estimation from a single image. We follow a standard two-step pipeline by first detecting the 2D position of the $N$ body joints, and then using these observations to infer 3D pose. For the first step, we use a recent CNN-based detector. For the second step, most existing approaches perform 2$N$-to-3$N$ regression of the Cartesian joint coordinates. We show that more precise pose estimates can be obtained by representing both the 2D and 3D human poses using $N\times N$ distance matrices, and formulating the problem as a 2D-to-3D distance matrix regression. For learning such a regressor we leverage on simple Neural Network architectures, which by construction, enforce positivity and symmetry of the predicted matrices. The approach has also the advantage to naturally handle missing observations and allowing to hypothesize the position of non-observed joints. Quantitative results on Humaneva and Human3.6M datasets demonstrate consistent performance gains over state-of-the-art. Qualitative evaluation on the images in-the-wild of the LSP dataset, using the regressor learned on Human3.6M, reveals very promising generalization results.

PDF Abstract CVPR 2017 PDF CVPR 2017 Abstract

Datasets


Results from the Paper


Results from Other Papers


Task Dataset Model Metric Name Metric Value Rank Source Paper Compare
3D Human Pose Estimation Human3.6M Moreno PA-MPJPE 76.5 # 112
3D Human Pose Estimation HumanEva-I EDM Mean Reconstruction Error (mm) 26.9 # 19

Methods


No methods listed for this paper. Add relevant methods here