Representation learning of vertex heatmaps for 3D human mesh reconstruction from multi-view images

29 Jun 2023 · Sungho Chun, Sungbum Park, Ju Yong Chang

This study addresses the problem of 3D human mesh reconstruction from multi-view images. Recently, approaches that directly estimate the vertices of a skinned multi-person linear model (SMPL) human mesh from input images via a volumetric heatmap representation have shown good performance. We show that representation learning of vertex heatmaps using an autoencoder helps improve the performance of such approaches. The vertex heatmap autoencoder (VHA) learns the manifold of plausible human meshes in the form of latent codes using AMASS, a large-scale motion capture dataset. The body code predictor (BCP) exploits the body prior learned by VHA for human mesh reconstruction from multi-view images, through latent code-based supervision and transfer of pretrained weights. In experiments on the Human3.6M and LightStage datasets, the proposed method outperforms previous methods and achieves state-of-the-art human mesh reconstruction performance.
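The volumetric heatmap representation the abstract refers to can be sketched in a few lines: each mesh vertex is rendered as a Gaussian blob in a voxel grid, and the resulting per-vertex volumes are what an autoencoder like VHA would compress into latent codes. The sketch below is a minimal NumPy illustration; the grid resolution, sigma, and bounds are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def vertex_heatmap(vertices, grid=17, sigma_vox=1.5, bounds=(-1.0, 1.0)):
    """Render each 3D vertex as a Gaussian blob in a voxel grid.

    vertices : (V, 3) array of xyz coordinates inside `bounds`.
    Returns a (V, grid, grid, grid) array of per-vertex heatmaps.
    Grid resolution and sigma are illustrative, not the paper's values.
    """
    lo, hi = bounds
    axis = np.linspace(lo, hi, grid)
    # coords[i, j, k] holds the world-space centre of voxel (i, j, k)
    zz, yy, xx = np.meshgrid(axis, axis, axis, indexing="ij")
    coords = np.stack([xx, yy, zz], axis=-1)
    voxel = (hi - lo) / (grid - 1)        # world-space voxel size
    sigma = sigma_vox * voxel
    # Squared distance from every voxel centre to every vertex (broadcast)
    d2 = np.sum((coords[None] - vertices[:, None, None, None, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

# A single vertex at the origin peaks at the centre voxel of a 17^3 grid.
hm = vertex_heatmap(np.array([[0.0, 0.0, 0.0]]))
peak = np.unravel_index(hm[0].argmax(), hm[0].shape)  # (8, 8, 8)
```

In the paper's pipeline these volumes are the autoencoder's input/output domain; supervising a multi-view network on VHA's latent codes (rather than only on the raw heatmaps) is what injects the learned body prior.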


Results from the Paper


 Ranked #1 on 3D Human Pose Estimation on Human3.6M (using extra training data)

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| 3D Human Pose Estimation | Human3.6M | BCP+VHA R152 384x384 | Average MPJPE (mm) | 15.6 | #1 |
| | | | Using 2D ground-truth joints | No | #2 |
| | | | Multi-View or Monocular | Multi-View | #1 |
| | | | MPVE (mm) | 22.3 | #3 |
| | | | Angular Error | 10.87 | #1 |
