Image: Zimmerman et al
To construct FrankMocap, we build a state-of-the-art monocular 3D hand motion capture method by taking the hand part of the whole-body parametric model SMPL-X.
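As a rough illustration of "taking the hand part" of SMPL-X, the sketch below slices the per-hand articulation parameters out of a flattened pose vector. It assumes the 165-dim axis-angle layout used by the `smplx` Python package (global orient 3, body 63, jaw 3, two eyes 3+3, left hand 45, right hand 45); the function name and offsets are illustrative, not FrankMocap's actual code.

```python
import numpy as np

# Assumed SMPL-X flattened pose layout (smplx package ordering):
# global orient (3) + body (63) + jaw (3) + eyes (3+3)
# + left hand (45) + right hand (45) = 165 axis-angle values.
SMPLX_POSE_DIM = 165
LHAND_START = 75    # 3 + 63 + 3 + 3 + 3
RHAND_START = 120   # LHAND_START + 45

def split_hand_pose(full_pose):
    """Slice the 45-dim per-hand articulations (15 joints x 3
    axis-angle values each) out of a flattened SMPL-X pose vector."""
    assert full_pose.shape[-1] == SMPLX_POSE_DIM
    left = full_pose[..., LHAND_START:LHAND_START + 45]
    right = full_pose[..., RHAND_START:RHAND_START + 45]
    return left, right

# Toy usage: a pose vector whose entries equal their own indices,
# so the slices reveal which offsets each hand occupies.
pose = np.arange(SMPLX_POSE_DIM, dtype=np.float32)
left_hand, right_hand = split_hand_pose(pose)
```

Because the hand sub-vectors are MANO-compatible, a hand-only regressor trained this way can later be merged with a body estimate in the same parameter space.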
Low-cost consumer depth cameras and deep learning have enabled reasonable 3D hand pose estimation from single depth images.
This work addresses a novel and challenging problem of estimating the full 3D hand shape and pose from a single RGB image.
Most of the previous image-based 3D human pose and mesh estimation methods estimate parameters of the human mesh model from an input image.
Ranked #1 on 3D Hand Pose Estimation on FreiHAND
Official Torch7 implementation of "V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map", CVPR 2018
Ranked #4 on Hand Pose Estimation on HANDS 2017
To overcome these weaknesses, we first cast the 3D hand and human pose estimation problem from a single depth map as a voxel-to-voxel prediction that uses a 3D voxelized grid and estimates the per-voxel likelihood for each keypoint.
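A minimal NumPy sketch of the voxel-to-voxel idea: depth pixels are back-projected to 3D points and binned into an occupancy grid, and a keypoint location is read off a per-voxel likelihood volume with an argmax. The grid size, camera intrinsics, and cube extent below are illustrative assumptions, and the network that maps occupancy to likelihoods is omitted.

```python
import numpy as np

GRID = 32          # voxels per axis (illustrative; the paper uses a larger grid)
FX = FY = 240.0    # assumed focal lengths (pixels)
CX = CY = 64.0     # assumed principal point (pixels)

def depth_to_points(depth):
    """Back-project a depth map (H, W) in mm into an (N, 3) point cloud."""
    v, u = np.nonzero(depth)
    z = depth[v, u].astype(np.float64)
    x = (u - CX) * z / FX
    y = (v - CY) * z / FY
    return np.stack([x, y, z], axis=1)

def voxelize(points, half=200.0):
    """Bin points into a GRID^3 binary occupancy volume inside a cube of
    half-extent `half` mm centered on the point cloud's centroid."""
    center = points.mean(axis=0)
    idx = ((points - center + half) / (2 * half) * GRID).astype(int)
    idx = idx[((idx >= 0) & (idx < GRID)).all(axis=1)]
    vol = np.zeros((GRID, GRID, GRID), dtype=np.float32)
    vol[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return vol, center

def keypoint_from_likelihood(likelihood, center, half=200.0):
    """Convert one keypoint's per-voxel likelihood volume back to a 3D
    coordinate (mm) by taking the center of its most likely voxel."""
    i, j, k = np.unravel_index(np.argmax(likelihood), likelihood.shape)
    frac = (np.array([i, j, k]) + 0.5) / GRID
    return frac * (2 * half) - half + center

# Toy usage: a flat synthetic depth patch at z = 500 mm.
depth = np.zeros((128, 128))
depth[48:80, 48:80] = 500.0
vol, center = voxelize(depth_to_points(depth))
```

Operating on this volumetric input and output, rather than regressing coordinates directly from the 2D depth image, is what lets the network treat the task as a dense per-voxel classification.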
Ranked #1 on Pose Estimation on ITOP front-view
To understand how people look, interact, or perform tasks, we need to quickly and accurately capture their 3D body, face, and hands together from an RGB image.
Our dataset and experiments can be of interest to communities of 3D hand pose estimation, 6D object pose, and robotics as well as action recognition.