CLIFF: Carrying Location Information in Full Frames into Human Pose and Shape Estimation

1 Aug 2022 · Zhihao LI, Jianzhuang Liu, Zhensong Zhang, Songcen Xu, Youliang Yan

Top-down methods dominate the field of 3D human pose and shape estimation because they are decoupled from human detection and allow researchers to focus on the core problem. However, cropping, their first step, discards the location information from the very beginning, which leaves them unable to accurately predict the global rotation in the original camera coordinate system. To address this problem, we propose to Carry Location Information in Full Frames (CLIFF) into this task. Specifically, we feed more holistic features to CLIFF by concatenating the cropped-image feature with its bounding box information. We calculate the 2D reprojection loss with a broader view of the full frame, taking a projection process similar to that of the person projected in the image. Fed and supervised by global-location-aware information, CLIFF directly predicts the global rotation along with more accurate articulated poses. In addition, we propose a pseudo-ground-truth annotator based on CLIFF, which provides high-quality 3D annotations for in-the-wild 2D datasets and offers crucial full supervision for regression-based methods. Extensive experiments on popular benchmarks show that CLIFF outperforms prior art by a significant margin and reaches first place on the AGORA leaderboard (the SMPL-Algorithms track). The code and data are available at https://github.com/huawei-noah/noah-research/tree/master/CLIFF.
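
The feature concatenation described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the bounding box is encoded, so the encoding below is an assumption for illustration: the box center is measured from the image center, and the center offsets and box size are divided by a focal length, with the image diagonal as a rough stand-in when no focal length is known. The function names `bbox_info` and `holistic_feature` are hypothetical and are not taken from the released code.

```python
import math


def bbox_info(center_x, center_y, box_size, img_w, img_h, focal_length=None):
    """Encode a bounding box relative to the full frame (illustrative only).

    Assumption: the box center is expressed as an offset from the image
    center, and all three values are divided by a focal length. When no
    focal length is given, the image diagonal is used as a rough stand-in.
    """
    if focal_length is None:
        focal_length = math.hypot(img_w, img_h)
    cx = (center_x - img_w / 2.0) / focal_length
    cy = (center_y - img_h / 2.0) / focal_length
    return [cx, cy, box_size / focal_length]


def holistic_feature(crop_feature, box):
    """Concatenate the cropped-image feature with the bounding box encoding."""
    return list(crop_feature) + list(box)


# Placeholder standing in for a backbone feature vector of the cropped image.
crop_feature = [0.1, -0.3, 0.7, 0.2]

# A 500-pixel box centered in a 1920x1080 frame.
box = bbox_info(center_x=960, center_y=540, box_size=500, img_w=1920, img_h=1080)
feature = holistic_feature(crop_feature, box)
```

With this encoding, two identical crops taken from different places in the frame yield different feature vectors, which is the location cue that plain cropping discards.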


Results from the Paper


Task                                   Dataset    Model           Metric Name             Metric Value  Global Rank
3D Human Pose Estimation               3DPW       CLIFF (HR-W48)  PA-MPJPE                43            #25
3D Human Pose Estimation               3DPW       CLIFF (HR-W48)  MPJPE                   69            #20
3D Human Pose Estimation               3DPW       CLIFF (HR-W48)  MPVPE                   81.2          #17
3D Human Pose Estimation               EMDB       CLIFF           Average MPJPE (mm)      103.134       #3
3D Human Pose Estimation               EMDB       CLIFF           Average MPJPE-PA (mm)   68.7969       #4
3D Human Pose Estimation               EMDB       CLIFF           Average MVE (mm)        122.884       #4
3D Human Pose Estimation               EMDB       CLIFF           Average MVE-PA (mm)     81.3275       #3
3D Human Pose Estimation               EMDB       CLIFF           Average MPJAE (deg)     23.0933       #1
3D Human Pose Estimation               EMDB       CLIFF           Average MPJAE-PA (deg)  21.6265       #8
3D Human Pose Estimation               EMDB       CLIFF           Jitter (10m/s^3)        55.4525       #2
Unsupervised 3D Human Pose Estimation  Human3.6M  CLIFF (HR-W48)  PA-MPJPE                32.7          #1
3D Human Pose Estimation               Human3.6M  CLIFF (HR-W48)  Average MPJPE (mm)      47.1          #131
