TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Monocular 3D Human Pose Estimation	Human3.6M	Monocular Total Capture	Average MPJPE (mm)	58.3	# 28
Monocular 3D Human Pose Estimation	Human3.6M	Monocular Total Capture	Use Video Sequence	No	# 1
Monocular 3D Human Pose Estimation	Human3.6M	Monocular Total Capture	Frames Needed	1	# 1
Monocular 3D Human Pose Estimation	Human3.6M	Monocular Total Capture	Need Ground Truth 2D Pose	No	# 1
3D Human Pose Estimation	Human3.6M	Monocular Total Capture	Average MPJPE (mm)	58.3	# 238
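
The Average MPJPE (mm) values above report mean per-joint position error, the average Euclidean distance between predicted and ground-truth 3D joint positions. A minimal sketch of that computation follows; the function name `mpjpe` and the sample joints are hypothetical, and the sketch omits any root alignment or protocol-specific preprocessing a benchmark evaluation may apply, so it is not the evaluation code behind the numbers in the table.

```python
import math


def mpjpe(predicted, ground_truth):
    """Mean per-joint position error.

    Both arguments are sequences of (x, y, z) joint positions in the same
    units (millimetres here) and in the same joint order.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("need two non-empty joint lists of equal length")
    # Euclidean distance per joint, then the mean over all joints.
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


# Hypothetical two-joint example: per-joint errors are 5 mm and 12 mm.
pred = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
gt = [(3.0, 4.0, 0.0), (100.0, 0.0, 12.0)]
error_mm = mpjpe(pred, gt)
print(error_mm)  # 8.5
```

For a dataset-level score, this per-pose value would typically be averaged over all evaluated frames.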

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/monocular-total-capture-posing-face-body-and/monocular-3d-human-pose-estimation-on-human3)](https://paperswithcode.com/sota/monocular-3d-human-pose-estimation-on-human3?p=monocular-total-capture-posing-face-body-and)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/monocular-total-capture-posing-face-body-and/3d-human-pose-estimation-on-human36m)](https://paperswithcode.com/sota/3d-human-pose-estimation-on-human36m?p=monocular-total-capture-posing-face-body-and)`

Monocular Total Capture: Posing Face, Body, and Hands in the Wild

CVPR 2019 · Donglai Xiang, Hanbyul Joo, Yaser Sheikh

We present the first method to capture the 3D total motion of a target person from a monocular view input. Given an image or a monocular video, our method reconstructs the motion from body, face, and fingers represented by a 3D deformable mesh model. We use an efficient representation called 3D Part Orientation Fields (POFs) to encode the 3D orientations of all body parts in the common 2D image space. POFs are predicted by a Fully Convolutional Network (FCN), along with the joint confidence maps. To train our network, we collect a new 3D human motion dataset capturing diverse total body motion of 40 subjects in a multiview system. We leverage a 3D deformable human model to reconstruct total body pose from the CNN outputs by exploiting the pose and shape prior in the model. We also present a texture-based tracking method to obtain temporally coherent motion capture output. We perform thorough quantitative evaluations including comparison with the existing body-specific and hand-specific methods, and performance analysis on camera viewpoint and human pose changes. Finally, we demonstrate the results of our total body motion capture on various challenging in-the-wild videos. Our code and newly collected human motion dataset will be publicly shared.
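
To make the idea of encoding 3D part orientations in 2D image space concrete, here is a rough sketch of how a target field for a single body part could be built: pixels near the part's 2D projection store the unit 3D direction from one joint to the other, and all other pixels store zero. The function names, the `radius` threshold, and the nested-list layout are assumptions made for illustration; the abstract does not specify these details, and this is not the authors' implementation.

```python
import math


def unit_vector(joint_a, joint_b):
    """Unit 3D direction pointing from joint_a to joint_b."""
    delta = [b - a for a, b in zip(joint_a, joint_b)]
    norm = math.sqrt(sum(c * c for c in delta))
    if norm == 0:
        raise ValueError("joints coincide; orientation is undefined")
    return tuple(c / norm for c in delta)


def point_segment_distance(p, a, b):
    """Distance from 2D point p to the segment with endpoints a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    abx, aby = bx - ax, by - ay
    length_sq = abx * abx + aby * aby
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * abx + (py - ay) * aby) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * abx), py - (ay + t * aby))


def part_orientation_field(joint3d_a, joint3d_b, pixel_a, pixel_b,
                           width, height, radius=1.0):
    """Build a height x width grid of 3-vectors for one body part.

    Pixels within `radius` (an assumed threshold) of the 2D segment
    pixel_a -> pixel_b hold the part's unit 3D direction; the rest hold zeros.
    """
    direction = unit_vector(joint3d_a, joint3d_b)
    zero = (0.0, 0.0, 0.0)
    field = [[zero for _ in range(width)] for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if point_segment_distance((x, y), pixel_a, pixel_b) <= radius:
                field[y][x] = direction
    return field


# Hypothetical part: 3D joints at (0, 0, 0) and (0, 3, 4), projected to
# pixels (1, 1) and (1, 4) in a 6 x 6 image.
field = part_orientation_field((0, 0, 0), (0, 3, 4), (1, 1), (1, 4),
                               width=6, height=6, radius=0.5)
on_part = field[2][1]   # pixel (x=1, y=2) lies on the projected part
off_part = field[4][4]  # pixel (x=4, y=4) is far from the part
```

In this sketch the field is indexed as `field[y][x]`, so the depth component of the orientation is stored alongside the two image-plane components at every pixel the part covers.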


Code


CMU-Perceptual-Computing-Lab/Monocu…

652

Tasks


3D Human Pose Estimation

Hand Pose Estimation

Monocular 3D Human Pose Estimation

Datasets

MS COCO

Human3.6M

PosePrior

Results from the Paper


Ranked #28 on Monocular 3D Human Pose Estimation on Human3.6M


