Towards unconstrained joint hand-object reconstruction from RGB videos

16 Aug 2021  ·  Yana Hasson, Gül Varol, Ivan Laptev, Cordelia Schmid

Our work aims to obtain 3D reconstructions of hands and manipulated objects from monocular videos. Reconstructing hand-object manipulations holds great potential for robotics and for learning from human demonstrations. The supervised learning approach to this problem, however, requires 3D supervision and remains limited to constrained laboratory settings and simulators for which 3D ground truth is available. In this paper we first propose a learning-free fitting approach for hand-object reconstruction which can seamlessly handle two-hand object interactions. Our method relies on cues obtained with common methods for object detection, hand pose estimation and instance segmentation. We quantitatively evaluate our approach and show that it can be applied to datasets with varying levels of difficulty for which training data is unavailable.
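The core idea of a learning-free fitting approach is to optimize 3D pose parameters directly against 2D cues from off-the-shelf detectors, rather than training a regressor. The toy sketch below illustrates this idea only: it fits a rigid 3D translation of a known joint template to simulated 2D keypoint detections under a pinhole camera by linear least squares. All names, the template, and the focal length are illustrative assumptions, not the paper's actual pipeline, which combines detection, segmentation and hand-pose cues.

```python
import numpy as np

FOCAL = 500.0  # assumed pinhole focal length in pixels (illustrative value)

def project(points_3d):
    """Pinhole projection of Nx3 camera-frame points to Nx2 pixel coords."""
    return FOCAL * points_3d[:, :2] / points_3d[:, 2:3]

def fit_translation(template, detections_2d):
    """Recover a translation t so that project(template + t) matches the
    2D detections; the pinhole constraint is linear in t, so we can solve
    it in closed form with least squares."""
    u, v = detections_2d[:, 0], detections_2d[:, 1]
    X, Y, Z = template[:, 0], template[:, 1], template[:, 2]
    zeros = np.zeros_like(u)
    # From FOCAL*(X+tx) = u*(Z+tz): FOCAL*tx - u*tz = u*Z - FOCAL*X (same for v)
    A = np.vstack([
        np.stack([np.full_like(u, FOCAL), zeros, -u], axis=1),
        np.stack([zeros, np.full_like(v, FOCAL), -v], axis=1),
    ])
    b = np.concatenate([u * Z - FOCAL * X, v * Z - FOCAL * Y])
    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return t

# Synthetic demo: a canonical 21-joint "hand template" placed 60 cm away.
rng = np.random.default_rng(0)
template = rng.normal(size=(21, 3)) * 0.05
t_true = np.array([0.02, -0.01, 0.60])
detections = project(template + t_true)  # stands in for a 2D keypoint detector
t_est = fit_translation(template, detections)
print(np.round(t_est, 4))  # recovers t_true: [ 0.02 -0.01  0.6 ]
```

In practice the objective also includes silhouette, contact and temporal terms and is solved with nonlinear optimization, but the structure is the same: 3D parameters are refined until their reprojection agrees with the 2D evidence.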


Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank |
|------|---------|-------|--------|-------|-------------|
| hand-object pose | DexYCB | UHO | Average MPJPE (mm) | 18.8 | #8 |
| | | | Procrustes-Aligned MPJPE (mm) | - | #4 |
| | | | OCE | - | #6 |
| | | | MCE | 52.5 | #4 |
| | | | ADD-S | - | #4 |
| hand-object pose | HO-3D | HOR | Average MPJPE (mm) | - | #6 |
| | | | ST-MPJPE (mm) | 26.8 | #5 |
| | | | PA-MPJPE (mm) | 12.0 | #9 |
| | | | OME | 80.0 | #6 |
| | | | ADD-S | 40.0 | #6 |
| 3D Hand Pose Estimation | HO-3D | HOR | Average MPJPE (mm) | - | #10 |
| | | | ST-MPJPE (mm) | 26.8 | #11 |
| | | | PA-MPJPE (mm) | 12.0 | #13 |
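For reference, the joint-error metrics in the table can be sketched as follows. This is a generic definition (MPJPE as the mean Euclidean distance between predicted and ground-truth joints, PA-MPJPE as the same error after Procrustes similarity alignment), not the benchmarks' official evaluation code, and the 21-joint demo data is synthetic.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error over Nx3 joint arrays."""
    return np.linalg.norm(pred - gt, axis=1).mean()

def pa_mpjpe(pred, gt):
    """MPJPE after similarity (Procrustes) alignment of pred to gt,
    removing global rotation, translation and scale (Kabsch-Umeyama)."""
    mu_p, mu_g = pred.mean(0), gt.mean(0)
    P, G = pred - mu_p, gt - mu_g
    U, S, Vt = np.linalg.svd(P.T @ G)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                              # optimal rotation
    s = (S * np.diag(D)).sum() / (P ** 2).sum()     # optimal scale
    aligned = s * P @ R.T + mu_g
    return mpjpe(aligned, gt)

# Demo: a rotated, scaled and shifted copy of the ground truth has a large
# MPJPE but a near-zero PA-MPJPE, since alignment removes the similarity.
rng = np.random.default_rng(1)
gt = rng.normal(size=(21, 3))
theta = 0.3
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
pred = 1.1 * gt @ Rz.T + np.array([0.05, 0.0, 0.02])
print(round(pa_mpjpe(pred, gt), 6))  # → 0.0
```

ST-MPJPE on HO-3D is the analogous error after removing only a global translation, and ADD-S / OME are object-pose errors; their exact protocols are defined by the respective benchmarks.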

Methods


No methods listed for this paper.