Pose Tracking
60 papers with code • 3 benchmarks • 9 datasets
Pose Tracking is the task of estimating multi-person human poses in videos and assigning a unique instance ID to each set of keypoints across frames. Accurate estimation of human keypoint trajectories is useful for human action recognition, human interaction understanding, motion capture, and animation.
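The ID-assignment step above can be sketched as a simple greedy matcher: link each pose detected in the current frame to the nearest track from the previous frame, and spawn a new ID when no track is close enough. This is a minimal illustration, not any specific paper's method; real trackers typically use OKS or box IoU similarity with Hungarian matching, and the `dist_thresh` parameter is a made-up value for the sketch.

```python
import numpy as np

def match_poses(prev_tracks, curr_poses, dist_thresh=50.0):
    """Greedily assign track IDs from frame t-1 to poses in frame t.

    prev_tracks: dict mapping track_id -> (K, 2) keypoint array from the previous frame.
    curr_poses:  list of (K, 2) keypoint arrays detected in the current frame.
    Returns a dict mapping track_id -> pose for the current frame.
    Simplified sketch: mean per-keypoint distance stands in for OKS/IoU,
    and greedy assignment stands in for Hungarian matching.
    """
    next_id = max(prev_tracks, default=-1) + 1
    assigned, used = {}, set()
    for pose in curr_poses:
        best_id, best_d = None, dist_thresh
        for tid, prev in prev_tracks.items():
            if tid in used:
                continue
            # Mean Euclidean distance over corresponding keypoints.
            d = np.linalg.norm(pose - prev, axis=1).mean()
            if d < best_d:
                best_id, best_d = tid, d
        if best_id is None:
            # No previous track is close enough: start a new identity.
            best_id = next_id
            next_id += 1
        used.add(best_id)
        assigned[best_id] = pose
    return assigned
```

Running this per frame propagates identities forward, which is the "online top-down" tracking pattern: detect poses first, then associate them with existing tracks.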
Source: LightTrack: A Generic Framework for Online Top-Down Human Pose Tracking
Libraries
Use these libraries to find Pose Tracking models and implementations
Datasets
Latest papers
VideoMAC: Video Masked Autoencoders Meet ConvNets
In this paper, we propose a new approach, termed VideoMAC, which combines video masked autoencoders with resource-friendly ConvNets.
Towards Real-World Aerial Vision Guidance with Categorical 6D Pose Tracker
In this paper, we investigate the real-world task of aerial vision guidance for robotic manipulation, utilizing category-level 6-DoF pose tracking.
PACE: Pose Annotations in Cluttered Environments
Addressing this, we introduce PACE (Pose Annotations in Cluttered Environments), a large-scale benchmark designed to advance the development and evaluation of pose estimation methods in cluttered scenarios.
EVI-SAM: Robust, Real-time, Tightly-coupled Event-Visual-Inertial State Estimation and 3D Dense Mapping
To the best of our knowledge, this is the first non-learning work to realize event-based dense mapping.
Deep Event Visual Odometry
To remove the dependency on additional sensors and to push the limits of using only a single event camera, we present Deep Event VO (DEVO), the first monocular event-only system with strong performance on a large number of real-world benchmarks.
FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
We present FoundationPose, a unified foundation model for 6D object pose estimation and tracking, supporting both model-based and model-free setups.
Pose2Gait: Extracting Gait Features from Monocular Video of Individuals with Dementia
In this work, we train a deep neural network to map from a two-dimensional pose sequence, extracted from a video of an individual walking down a hallway toward a wall-mounted camera, to a set of three-dimensional spatiotemporal gait features averaged over the walking sequence.
Humans in 4D: Reconstructing and Tracking Humans with Transformers
To analyze video, we use 3D reconstructions from HMR 2.0 as input to a tracking system that operates in 3D.
Multimodal video and IMU kinematic dataset on daily life activities using affordable devices (VIDIMU)
Human activity recognition and clinical biomechanics are challenging problems in physical telerehabilitation medicine.
BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects
We present a near real-time method for 6-DoF tracking of an unknown object from a monocular RGBD video sequence, while simultaneously performing neural 3D reconstruction of the object.