3D Human Motion Estimation via Motion Compression and Refinement

9 Aug 2020  ·  Zhengyi Luo, S. Alireza Golestaneh, Kris M. Kitani

We develop a technique for generating smooth and accurate 3D human pose and motion estimates from RGB video sequences. Our method, which we call Motion Estimation via Variational Autoencoder (MEVA), decomposes a temporal sequence of human motion into two parts: a smooth motion representation obtained through auto-encoder-based motion compression, and a residual representation learned through motion refinement. Estimation therefore proceeds in two stages: a general human motion estimation step that captures the coarse overall motion, followed by a residual estimation step that adds back person-specific motion details. Experiments show that our method produces both smooth and accurate 3D human pose and motion estimates.
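
The coarse-plus-residual idea in the abstract can be illustrated with a toy sketch. This is not MEVA's implementation: a centered moving average stands in for the learned auto-encoder compression, the residual is computed directly from the input rather than predicted by a refinement network, and the function names (`moving_average`, `decompose`, `reconstruct`, `mean_abs_accel`) are hypothetical. It only shows that a smooth coarse signal plus a residual recovers the original motion, and that the coarse part alone has less frame-to-frame jitter.

```python
import math


def moving_average(seq, window=5):
    """Smooth a 1D trajectory with a centered moving average.

    Stand-in (assumption) for the learned motion compression;
    the window shrinks at the sequence edges.
    """
    half = window // 2
    out = []
    for t in range(len(seq)):
        lo, hi = max(0, t - half), min(len(seq), t + half + 1)
        out.append(sum(seq[lo:hi]) / (hi - lo))
    return out


def decompose(seq, window=5):
    """Split a trajectory into a smooth coarse part and a residual."""
    coarse = moving_average(seq, window)
    residual = [x - c for x, c in zip(seq, coarse)]
    return coarse, residual


def reconstruct(coarse, residual):
    """Add the residual details back onto the coarse motion."""
    return [c + r for c, r in zip(coarse, residual)]


def mean_abs_accel(seq):
    """Mean absolute second finite difference, a rough jitter measure."""
    acc = [seq[t - 1] - 2 * seq[t] + seq[t + 1] for t in range(1, len(seq) - 1)]
    return sum(abs(a) for a in acc) / len(acc)


# Toy joint-angle trajectory: a slow sine with alternating per-frame jitter.
noisy = [math.sin(0.2 * t) + 0.05 * (-1) ** t for t in range(40)]
coarse, residual = decompose(noisy)
refined = reconstruct(coarse, residual)
print(mean_abs_accel(noisy), mean_abs_accel(coarse))
```

In the method described above, the residual would be estimated from video rather than taken from the signal itself, so the refined output would trade some smoothness for recovered detail instead of reproducing the input exactly.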

Results from the Paper


Ranked #14 on 3D Human Pose Estimation on 3DPW (Acceleration Error metric, using extra training data)

Task                      Dataset       Model  Metric Name         Metric Value  Global Rank
3D Human Pose Estimation  3DPW          MEVA   PA-MPJPE            54.7          # 79
3D Human Pose Estimation  3DPW          MEVA   MPJPE               86.9          # 82
3D Human Pose Estimation  3DPW          MEVA   Acceleration Error  11.6          # 14
3D Human Pose Estimation  Human3.6M     MEVA   Average MPJPE (mm)  76            # 288
3D Human Pose Estimation  Human3.6M     MEVA   PA-MPJPE            53.2          # 99
3D Human Pose Estimation  Human3.6M     MEVA   Acceleration Error  15.3          # 15
3D Human Pose Estimation  MPI-INF-3DHP  MEVA   MPJPE               96.4          # 57
3D Human Pose Estimation  MPI-INF-3DHP  MEVA   PA-MPJPE            65.4          # 18
3D Human Pose Estimation  MPI-INF-3DHP  MEVA   Acceleration Error  11.1          # 9
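
For readers unfamiliar with the metrics in the table, the sketch below shows how MPJPE and Acceleration Error are commonly computed. It assumes poses are given as nested lists of shape [frames][joints][3] and uses the usual finite-difference definition of acceleration; the function names are illustrative, and the exact evaluation protocol behind the numbers above (joint set, units, root alignment) is not stated on this page. PA-MPJPE is omitted because it additionally requires a Procrustes alignment of each predicted pose to the ground truth before the distance is measured.

```python
import math


def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance
    between predicted and ground-truth joints over all frames."""
    total, count = 0.0, 0
    for pred_frame, gt_frame in zip(pred, gt):
        for p, g in zip(pred_frame, gt_frame):
            total += math.dist(p, g)
            count += 1
    return total / count


def accelerations(seq):
    """Per-joint second finite difference: x[t-1] - 2*x[t] + x[t+1]."""
    out = []
    for t in range(1, len(seq) - 1):
        frame = []
        for j in range(len(seq[t])):
            frame.append(tuple(
                seq[t - 1][j][k] - 2 * seq[t][j][k] + seq[t + 1][j][k]
                for k in range(3)
            ))
        out.append(frame)
    return out


def accel_error(pred, gt):
    """Mean distance between predicted and ground-truth joint accelerations."""
    return mpjpe(accelerations(pred), accelerations(gt))


# One joint moving at constant velocity along x for four frames.
gt = [[(float(t), 0.0, 0.0)] for t in range(4)]
# Same motion with a constant offset: large position error, no acceleration error.
shifted = [[(float(t), 0.0, 10.0)] for t in range(4)]
print(mpjpe(shifted, gt), accel_error(shifted, gt))
```

The example highlights why the two metrics are reported separately: a constant offset hurts MPJPE but leaves Acceleration Error at zero, while frame-to-frame jitter can raise Acceleration Error even when the average position error is small.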

Methods