Epipolar Transformers

CVPR 2020 · Yihui He, Rui Yan, Katerina Fragkiadaki, Shoou-I Yu

A common approach to localize 3D human joints in a synchronized and calibrated multi-view setup consists of two steps: (1) apply a 2D detector separately on each view to localize joints in 2D, and (2) perform robust triangulation on 2D detections from each view to acquire the 3D joint locations. However, in step 1, the 2D detector is limited to solving challenging cases which could potentially be better resolved in 3D, such as occlusions and oblique viewing angles, purely in 2D without leveraging any 3D information...
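Step (2) of the pipeline described above can be illustrated with a minimal sketch. It uses plain linear (DLT) triangulation, not the robust variant the abstract refers to, and it is not taken from the paper. The camera matrices, the test joint, and the function names `triangulate_dlt` and `project` are all assumptions made for illustration.

```python
import numpy as np


def triangulate_dlt(projections, points_2d):
    """Triangulate one 3D point from its 2D locations in several calibrated views.

    projections: list of 3x4 camera projection matrices.
    points_2d:   list of (u, v) pixel coordinates, one per view.
    """
    rows = []
    for P, (u, v) in zip(projections, points_2d):
        # Each view contributes two linear constraints on the homogeneous point X.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The least-squares solution is the right singular vector belonging to
    # the smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]


def project(P, X):
    """Project a 3D point X into pixel coordinates with projection matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]


# Hypothetical two-camera rig: shared intrinsics, second camera shifted along x.
K = np.array([[1000.0, 0.0, 500.0],
              [0.0, 1000.0, 500.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

joint = np.array([0.2, -0.1, 4.0])                      # assumed ground-truth joint
detections = [project(P1, joint), project(P2, joint)]   # stand-in for 2D detector output
estimate = triangulate_dlt([P1, P2], detections)
```

With noise-free detections the estimate matches the original joint up to numerical precision. With real detector output, errors made in 2D carry over directly into the 3D estimate, which is the limitation the abstract points to.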


Results from the Paper


Ranked #2 on 3D Human Pose Estimation on Human3.6M (using extra training data)

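TASK | DATASET | MODEL | METRIC NAME | METRIC VALUE | GLOBAL RANK
3D Human Pose Estimation | Human3.6M | Epipolar Transformer+R50 256×256+RPSM | Average MPJPE (mm) | 26.9 | #4
3D Human Pose Estimation | Human3.6M | Epipolar Transformer+R50 256×256+RPSM | Using 2D ground-truth joints | No | #1
3D Human Pose Estimation | Human3.6M | Epipolar Transformer+R50 256×256+RPSM | Multi-View or Monocular | Multi-View | #1
3D Human Pose Estimation | Human3.6M | Epipolar Transformer+R152 384×384 | Average MPJPE (mm) | 19.0 | #2
3D Human Pose Estimation | Human3.6M | Epipolar Transformer+R152 384×384 | Using 2D ground-truth joints | No | #1
3D Human Pose Estimation | Human3.6M | Epipolar Transformer+R152 384×384 | Multi-View or Monocular | Multi-View | #1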
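The MPJPE metric reported above is the mean per-joint position error: the average Euclidean distance between predicted and ground-truth 3D joints. The sketch below shows that computation for a single pose. The joint coordinates are invented for illustration, the function name `mpjpe` is an assumption, and benchmark protocols may add steps (such as averaging over frames or aligning poses) that are not shown here.

```python
import math


def mpjpe(predicted, ground_truth):
    """Mean Euclidean distance between corresponding 3D joints.

    Both arguments are sequences of (x, y, z) tuples in the same unit;
    the result is in that unit as well.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("pose arrays must contain the same number of joints")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


# Hypothetical three-joint pose, coordinates in millimetres.
gt = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (0.0, 200.0, 0.0)]
pred = [(3.0, 4.0, 0.0), (100.0, 0.0, 12.0), (0.0, 200.0, 0.0)]

error_mm = mpjpe(pred, gt)  # per-joint errors are 5, 12 and 0 mm
```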

Methods used in the Paper