Keypoint Detection
150 papers with code • 7 benchmarks • 11 datasets
Keypoint Detection involves simultaneously detecting people and localizing their keypoints. Keypoints, also called interest points, are spatial locations in an image that mark what is interesting or what stands out. Ideally they remain stable under image rotation, scaling, translation, and distortion.
( Image credit: PifPaf: Composite Fields for Human Pose Estimation; "Learning to surf" by fotologic, license: CC-BY-2.0 )
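To make the idea of "localizing keypoints" concrete, here is a minimal sketch of one common way a keypoint location can be read out: taking the peak of a per-keypoint heatmap. This page does not say that any of the listed papers work this way. The heatmap representation, the function name `decode_keypoints`, and the toy values are all assumptions made for illustration.

```python
import numpy as np


def decode_keypoints(heatmaps):
    """Turn K heatmaps of shape (K, H, W) into a (K, 3) array of (x, y, score).

    Assumption: each heatmap scores how likely one keypoint is at each pixel,
    and the keypoint is placed at the highest-scoring pixel.
    """
    num_kp, height, width = heatmaps.shape
    flat = heatmaps.reshape(num_kp, -1)          # one row per keypoint
    idx = flat.argmax(axis=1)                    # flat index of each peak
    ys, xs = np.unravel_index(idx, (height, width))
    scores = flat[np.arange(num_kp), idx]        # peak value as confidence
    return np.stack([xs, ys, scores], axis=1).astype(float)


# Toy input: two hypothetical keypoints on a 4x5 grid.
heatmaps = np.zeros((2, 4, 5))
heatmaps[0, 1, 3] = 0.9   # keypoint 0 peaks at row 1, column 3
heatmaps[1, 2, 0] = 0.7   # keypoint 1 peaks at row 2, column 0

keypoints = decode_keypoints(heatmaps)
print(keypoints)
```

Each output row is an (x, y, score) triple, so a person with K keypoints becomes a K-by-3 array. Multi-person methods additionally have to decide which keypoints belong to which person, and this sketch does not attempt that.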
Most implemented papers
Rotate to Attend: Convolutional Triplet Attention Module
In this paper, we investigate light-weight but effective attention mechanisms and present triplet attention, a novel method for computing attention weights by capturing cross-dimension interaction using a three-branch structure.
ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation
In this paper, we show the surprisingly good capabilities of plain vision transformers for pose estimation from various aspects, namely simplicity in model structure, scalability in model size, flexibility in training paradigm, and transferability of knowledge between models, through a simple baseline model called ViTPose.
Data Distillation: Towards Omni-Supervised Learning
We investigate omni-supervised learning, a special regime of semi-supervised learning in which the learner exploits all available labeled data plus internet-scale sources of unlabeled data.
MultiPoseNet: Fast Multi-Person Pose Estimation using Pose Residual Network
In this paper, we present MultiPoseNet, a novel bottom-up multi-person pose estimation architecture that combines a multi-task model with a novel assignment method.
Slimmable Neural Networks
Instead of training individual networks with different width configurations, we train a shared network with switchable batch normalization.
Learning Delicate Local Representations for Multi-Person Pose Estimation
To tackle this problem, we propose an efficient attention mechanism, Pose Refine Machine (PRM), to make a trade-off between local and global representations in output features and further refine the keypoint locations.
Deep Alignment Network: A convolutional neural network for robust face alignment
Our method uses entire face images at all stages, contrary to the recently proposed face alignment methods that rely on local patches.
AI Challenger: A Large-scale Dataset for Going Deeper in Image Understanding
Significant progress has been achieved in Computer Vision by leveraging large-scale image datasets.
PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model
We present a box-free bottom-up approach for the tasks of pose estimation and instance segmentation of people in multi-person images using an efficient single-shot model.
CrowdPose: Efficient Crowded Scenes Pose Estimation and A New Benchmark
In this paper, we propose a novel and efficient method to tackle the problem of pose estimation in the crowd and a new dataset to better evaluate algorithms.