3D human pose estimation in video with temporal convolutions and semi-supervised training

CVPR 2019 · Dario Pavllo, Christoph Feichtenhofer, David Grangier, Michael Auli

In this work, we demonstrate that 3D poses in video can be effectively estimated with a fully convolutional model based on dilated temporal convolutions over 2D keypoints. We also introduce back-projection, a simple and effective semi-supervised training method that leverages unlabeled video data...
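To make "dilated temporal convolutions" concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a single scalar channel per frame (a real model would operate on vectors of 2D keypoint coordinates), an unpadded convolution, and hypothetical helper names `dilated_conv1d` and `receptive_field`. The kernel size and dilation values are illustrative only and are not taken from the abstract.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Unpadded 1D dilated convolution (cross-correlation) along the time axis.

    Each output sample combines len(kernel) input frames spaced
    `dilation` frames apart. Single-channel simplification (assumption).
    """
    span = (len(kernel) - 1) * dilation  # frames covered beyond the first tap
    return [
        sum(w * signal[t + k * dilation] for k, w in enumerate(kernel))
        for t in range(len(signal) - span)
    ]


def receptive_field(kernel_size, dilations):
    """Number of input frames seen by one output of stacked dilated layers."""
    frames = 1
    for d in dilations:
        frames += (kernel_size - 1) * d
    return frames


if __name__ == "__main__":
    frames = [1, 2, 3, 4, 5]
    print(dilated_conv1d(frames, [1, 1, 1], dilation=1))  # dense neighbours
    print(dilated_conv1d(frames, [1, 1, 1], dilation=2))  # every second frame
    # Illustrative stack: dilations that grow geometrically widen the
    # temporal context quickly while keeping the layer count small.
    print(receptive_field(3, [1, 3, 9, 27]))
```

The design point is that stacking layers with increasing dilation grows the temporal context exponentially with depth, while the number of parameters grows only linearly.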

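| TASK | DATASET | MODEL | METRIC NAME | METRIC VALUE | GLOBAL RANK |
|---|---|---|---|---|---|
| 3D Human Pose Estimation | Human3.6M | Temporal convolution + semi-supervision | Average MPJPE (mm) | 46.8 | #19 |
| 3D Human Pose Estimation | Human3.6M | Temporal convolution + semi-supervision | Using 2D ground-truth joints | No | #1 |
| 3D Human Pose Estimation | Human3.6M | Temporal convolution + semi-supervision | Multi-View or Monocular | Monocular | #1 |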
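The MPJPE metric reported above (mean per-joint position error) is the average Euclidean distance between predicted and ground-truth 3D joint positions, expressed in the units of the inputs (millimetres here). The following is a minimal sketch of that definition for a single pose. The function name `mpjpe` is hypothetical, and any benchmark-specific preprocessing (for example root-joint alignment or averaging over frames and actions) is not described on this page and is deliberately left out.

```python
import math


def mpjpe(predicted, target):
    """Mean per-joint position error for one pose.

    `predicted` and `target` are equal-length sequences of (x, y, z)
    joint positions. Returns the mean Euclidean distance per joint,
    in the same units as the inputs.
    """
    if not predicted or len(predicted) != len(target):
        raise ValueError("poses must be non-empty and have the same number of joints")
    total = sum(math.dist(p, t) for p, t in zip(predicted, target))
    return total / len(predicted)


if __name__ == "__main__":
    # Toy two-joint pose (made-up numbers, not benchmark data):
    # the first joint is exact, the second is off by 5 mm.
    pred = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
    gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    print(mpjpe(pred, gt))  # 2.5
```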
