TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK	REMOVE
3D Human Pose Estimation	HumanEva-I	DMHSR(J,B,D)	Mean Reconstruction Error (mm)	33.7	# 21

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/deep-multitask-architecture-for-integrated-2d/3d-human-pose-estimation-on-humaneva-i)](https://paperswithcode.com/sota/3d-human-pose-estimation-on-humaneva-i?p=deep-multitask-architecture-for-integrated-2d)`

Deep Multitask Architecture for Integrated 2D and 3D Human Sensing

CVPR 2017 · Alin-Ionut Popa, Mihai Zanfir, Cristian Sminchisescu ·

We propose a deep multitask architecture for \emph{fully automatic 2d and 3d human sensing} (DMHS), including \emph{recognition and reconstruction}, in \emph{monocular images}. The system computes the figure-ground segmentation, semantically identifies the human body parts at pixel level, and estimates the 2d and 3d pose of the person. The model supports the joint training of all components by means of multi-task losses where early processing stages recursively feed into advanced ones for increasingly complex calculations, accuracy and robustness. The design allows us to tie a complete training protocol, by taking advantage of multiple datasets that would otherwise restrictively cover only some of the model components: complex 2d image data with no body part labeling and without associated 3d ground truth, or complex 3d data with limited 2d background variability. In detailed experiments based on several challenging 2d and 3d datasets (LSP, HumanEva, Human3.6M), we evaluate the sub-structures of the model, the effect of various types of training data in the multitask loss, and demonstrate that state-of-the-art results can be achieved at all processing levels. We also show that in the wild our monocular RGB architecture is perceptually competitive to a state-of-the art (commercial) Kinect system based on RGB-D data.