Pose Estimation
1351 papers with code • 28 benchmarks • 114 datasets
Pose Estimation is a computer vision task whose goal is to detect the position and orientation of a person or an object. In Human Pose Estimation, this is usually done by predicting the locations of specific keypoints such as the head, elbows, and hands.
A common benchmark for this task is MPII Human Pose.
(Image credit: Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose)
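Many pose estimators predict one heatmap per keypoint and then decode each heatmap's peak into image coordinates. The following is a minimal sketch of that decoding step (the function name `decode_heatmaps` and the toy data are illustrative, not taken from any particular model):

```python
import numpy as np

def decode_heatmaps(heatmaps):
    """Convert per-keypoint heatmaps of shape (K, H, W) into
    (x, y, score) triples by taking each heatmap's argmax.

    This is the simplest common decoding used by heatmap-based
    pose estimators; real systems often add sub-pixel refinement.
    """
    K, H, W = heatmaps.shape
    keypoints = []
    for k in range(K):
        flat_idx = np.argmax(heatmaps[k])
        y, x = divmod(int(flat_idx), W)       # row-major unravel
        keypoints.append((x, y, float(heatmaps[k, y, x])))
    return keypoints

# Toy example: a single 4x4 heatmap with its peak at (x=2, y=1)
hm = np.zeros((1, 4, 4))
hm[0, 1, 2] = 0.9
print(decode_heatmaps(hm))  # [(2, 1, 0.9)]
```

The per-keypoint score (the heatmap value at the peak) is typically kept alongside the coordinates so that low-confidence joints can be filtered downstream.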
Libraries
Use these libraries to find Pose Estimation models and implementations.
Subtasks
- 3D Human Pose Estimation
- Keypoint Detection
- 3D Pose Estimation
- 6D Pose Estimation
- Hand Pose Estimation
- 6D Pose Estimation using RGB
- Multi-Person Pose Estimation
- Head Pose Estimation
- Human Pose Forecasting
- Animal Pose Estimation
- 6D Pose Estimation using RGBD
- Vehicle Pose Estimation
- RF-based Pose Estimation
- Car Pose Estimation
- Hand Joint Reconstruction
- Activeness Detection
- Semi-supervised 2D and 3D landmark labeling
Latest papers
DeDoDe v2: Analyzing and Improving the DeDoDe Keypoint Detector
First, we find that DeDoDe keypoints tend to cluster together, which we fix by performing non-max suppression on the target distribution of the detector during training.
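Non-max suppression for keypoints, as referenced above, keeps only locally dominant detections so that points do not cluster. A minimal greedy sketch (this is a generic illustration, not the DeDoDe v2 training-time procedure, and `keypoint_nms` is a hypothetical helper):

```python
import numpy as np

def keypoint_nms(points, scores, radius=4.0):
    """Greedy NMS over 2D keypoints: repeatedly keep the
    highest-scoring remaining point and suppress every point
    within `radius` pixels of it. Returns kept indices."""
    order = np.argsort(scores)[::-1]          # high score first
    suppressed = np.zeros(len(points), dtype=bool)
    keep = []
    for i in order:
        if suppressed[i]:
            continue
        keep.append(int(i))
        dists = np.linalg.norm(points - points[i], axis=1)
        suppressed |= dists < radius          # drop close neighbors
    return keep

pts = np.array([[10.0, 10.0], [11.0, 10.0], [50.0, 50.0]])
sc = np.array([0.9, 0.8, 0.7])
print(keypoint_nms(pts, sc))  # [0, 2]
```

Here the second point is suppressed because it lies within the radius of the stronger first detection, which is exactly the clustering behavior NMS is meant to remove.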
EventEgo3D: 3D Human Motion Capture from Egocentric Event Streams
In response to the existing limitations, this paper 1) introduces a new problem, i.e., 3D human motion capture from an egocentric monocular event camera with a fisheye lens, and 2) proposes the first approach to it, called EventEgo3D (EE3D).
DepthMOT: Depth Cues Lead to a Strong Multi-Object Tracker
Inspired by this, even though the bounding boxes of objects are close on the camera plane, we can differentiate them in the depth dimension, thereby establishing a 3D perception of the objects.
HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud
Extracting keypoint locations from input hand frames, known as 3D hand pose estimation, is a critical task in various human-computer interaction applications.
SDPose: Tokenized Pose Estimation via Circulation-Guide Self-Distillation
To mitigate the problem of under-fitting, we design a transformer module named Multi-Cycled Transformer (MCT), based on multiple cycled forward passes, to more fully exploit the potential of small model parameters.
Semi-Supervised Unconstrained Head Pose Estimation in the Wild
Existing head pose estimation datasets are either composed of numerous samples by non-realistic synthesis or lab collection, or limited images by labor-intensive annotating.
SelfPose3d: Self-Supervised Multi-Person Multi-View 3d Pose Estimation
Unlike current state-of-the-art fully-supervised methods, our approach does not require any 2d or 3d ground-truth poses and uses only the multi-view input images from a calibrated camera setup and 2d pseudo poses generated from an off-the-shelf 2d human pose estimator.
KTPFormer: Kinematics and Trajectory Prior Knowledge-Enhanced Transformer for 3D Human Pose Estimation
This paper presents a novel Kinematics and Trajectory Prior Knowledge-Enhanced Transformer (KTPFormer), which overcomes a weakness of existing transformer-based methods for 3D human pose estimation: the derivation of the Q, K, and V vectors in their self-attention mechanisms is based on simple linear mapping.
Video-Based Human Pose Regression via Decoupled Space-Time Aggregation
In light of this, we propose a novel Decoupled Space-Time Aggregation network (DSTA) to separately capture the spatial contexts between adjacent joints and the temporal cues of each individual joint, thereby avoiding the conflation of spatiotemporal dimensions.
Instance-Adaptive and Geometric-Aware Keypoint Learning for Category-Level 6D Object Pose Estimation
The second design is a Geometric-Aware Feature Aggregation module, which can efficiently integrate local and global geometric information into keypoint features.