Occlusion-Aware Networks for 3D Human Pose Estimation in Video

Occlusion is a key problem in 3D human pose estimation from monocular video. To address it, we introduce an occlusion-aware deep-learning framework. Using estimated 2D confidence heatmaps of keypoints together with an optical-flow consistency constraint, we filter out unreliable estimations of occluded keypoints. When occlusion occurs, we are left with incomplete 2D keypoints, which we feed to our 2D and 3D temporal convolutional networks (2D and 3D TCNs); these enforce temporal smoothness to produce a complete 3D pose. By using incomplete 2D keypoints, instead of complete but incorrect ones, our networks are less affected by the error-prone estimations of occluded keypoints. Training the occlusion-aware 3D TCN requires pairs of a 3D pose and a 2D pose with occlusion labels. As no such dataset is available, we introduce a "Cylinder Man Model" that approximates the occupancy of body parts in 3D space. By projecting the model onto a 2D plane at different viewing angles, we obtain and label the occluded keypoints, providing ample training data. In addition, we use this model to create a pose regularization constraint that prefers unreliably estimated 2D keypoints to be labeled as occluded. Our method outperforms state-of-the-art methods on the Human3.6M and HumanEva-I datasets.
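The filtering step lends itself to a short illustration. The sketch below is a hypothetical implementation, not the authors' released code: the function name, array shapes, and both thresholds are assumptions. It keeps a 2D keypoint only if (a) its heatmap confidence is high enough and (b) its position agrees with the previous frame's keypoint warped forward along the optical flow; joints failing either test are marked missing, so downstream TCNs see an incomplete pose rather than a complete but incorrect one.

```python
# Minimal sketch of occlusion-aware keypoint filtering (illustrative only;
# thresholds and interfaces are assumptions, not the paper's exact values).
import numpy as np

CONF_THRESH = 0.5   # assumed heatmap-confidence threshold
FLOW_THRESH = 5.0   # assumed flow-consistency threshold, in pixels

def filter_keypoints(kpts_t, conf_t, kpts_prev, flow_prev_to_t):
    """Mask out unreliable (likely occluded) 2D keypoints.

    kpts_t         : (J, 2) keypoint estimates at frame t
    conf_t         : (J,)   heatmap confidences at frame t
    kpts_prev      : (J, 2) keypoint estimates at frame t-1
    flow_prev_to_t : (H, W, 2) dense optical flow from frame t-1 to t

    Returns keypoints with unreliable entries set to NaN, plus a boolean
    visibility mask; the NaN joints form the "incomplete" 2D pose.
    """
    J = kpts_t.shape[0]
    visible = np.ones(J, dtype=bool)

    # (a) heatmap-confidence test
    visible &= conf_t >= CONF_THRESH

    # (b) optical-flow consistency test: warp frame t-1 keypoints to frame t
    xs = np.clip(kpts_prev[:, 0].round().astype(int), 0, flow_prev_to_t.shape[1] - 1)
    ys = np.clip(kpts_prev[:, 1].round().astype(int), 0, flow_prev_to_t.shape[0] - 1)
    warped = kpts_prev + flow_prev_to_t[ys, xs]   # predicted positions at t
    visible &= np.linalg.norm(kpts_t - warped, axis=1) <= FLOW_THRESH

    filtered = kpts_t.astype(float).copy()
    filtered[~visible] = np.nan                   # mark occluded joints as missing
    return filtered, visible
```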


Datasets

Human3.6M, HumanEva-I

Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
| --- | --- | --- | --- | --- | --- |
| 3D Human Pose Estimation | Human3.6M | Occlusion-Aware Networks | Average MPJPE (mm) | 42.9 | #86 |
| 3D Human Pose Estimation | Human3.6M | Occlusion-Aware Networks | Using 2D ground-truth joints | No | #2 |
| 3D Human Pose Estimation | Human3.6M | Occlusion-Aware Networks | Multi-View or Monocular | Monocular | #1 |
| 3D Human Pose Estimation | HumanEva-I | Occlusion-Aware Networks | Mean Reconstruction Error (mm) | 14.3 | #4 |

Methods


No methods listed for this paper.