Keypoint Detection

148 papers with code • 7 benchmarks • 11 datasets

Keypoint Detection involves simultaneously detecting people and localizing their keypoints. Keypoints, also called interest points, are spatial locations in an image that mark what is interesting or what stands out. Ideally, they remain stable under image rotation, shrinkage, translation, distortion, and so on.

(Image credit: PifPaf: Composite Fields for Human Pose Estimation; "Learning to surf" by fotologic, license: CC-BY-2.0)

Most implemented papers

ArtTrack: Articulated Multi-person Tracking in the Wild

eldar/pose-tensorflow CVPR 2017

In this paper we propose an approach for articulated tracking of multiple people in unconstrained videos.

Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation

hellojialee/Improved-Body-Parts 24 Nov 2019

We rethink a well-known bottom-up approach for multi-person pose estimation and propose an improved one.

Rethinking on Multi-Stage Networks for Human Pose Estimation

megvii-detection/MSPN 1 Jan 2019

Existing pose estimation approaches fall into two categories: single-stage and multi-stage methods.

Pose2Seg: Detection Free Human Instance Segmentation

liruilong940607/Pose2Seg CVPR 2019

We demonstrate that our pose-based framework can achieve better accuracy than the state-of-the-art detection-based approach on the human instance segmentation problem, and can moreover better handle occlusion.

Distribution-Aware Coordinate Representation for Human Pose Estimation

leoxiaobin/deep-high-resolution-net.pytorch CVPR 2020

Interestingly, we found that the process of decoding the predicted heatmaps into the final joint coordinates in the original image space is surprisingly significant for human pose estimation performance, which nevertheless was not recognised before.
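
To make the decoding step mentioned above concrete, here is a minimal sketch of the conventional baseline: take the heatmap argmax, shift a quarter pixel toward the higher neighbour, and rescale to input-image coordinates. This is not the distribution-aware method the paper proposes. The function name `decode_heatmap`, the `stride` parameter, and its default of 4 are assumptions made for illustration.

```python
from typing import List, Tuple


def decode_heatmap(heatmap: List[List[float]], stride: float = 4.0) -> Tuple[float, float]:
    """Return (x, y) in input-image pixels for the peak of one keypoint heatmap.

    `stride` is the assumed downsampling factor between image and heatmap.
    """
    height, width = len(heatmap), len(heatmap[0])

    # Locate the maximum activation in integer heatmap coordinates.
    best_y, best_x = max(
        ((y, x) for y in range(height) for x in range(width)),
        key=lambda p: heatmap[p[0]][p[1]],
    )
    x, y = float(best_x), float(best_y)

    # Shift a quarter pixel toward the higher neighbour on each axis
    # (skipped at the border, where one neighbour is missing).
    if 0 < best_x < width - 1:
        diff = heatmap[best_y][best_x + 1] - heatmap[best_y][best_x - 1]
        x += 0.25 * ((diff > 0) - (diff < 0))
    if 0 < best_y < height - 1:
        diff = heatmap[best_y + 1][best_x] - heatmap[best_y - 1][best_x]
        y += 0.25 * ((diff > 0) - (diff < 0))

    # Map from heatmap resolution back to the input image.
    return x * stride, y * stride


example = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 1.0, 0.2],
    [0.0, 0.0, 0.3, 0.0],
]
print(decode_heatmap(example))
```

The quarter-pixel shift is a coarse correction for the quantisation introduced by the low-resolution heatmap, which is the kind of decoding detail the abstract argues matters for final accuracy.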

OpenPifPaf: Composite Fields for Semantic Keypoint Detection and Spatio-Temporal Association

vita-epfl/openpifpaf 3 Mar 2021

We present a generic neural network architecture that uses Composite Fields to detect and construct a spatio-temporal pose which is a single, connected graph whose nodes are the semantic keypoints (e.g., a person's body joints) in multiple frames.
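
As a rough illustration of the data structure described above, the sketch below represents a spatio-temporal pose as an undirected graph whose nodes are (frame, keypoint) pairs and checks that it forms a single connected component. It does not use or reproduce the OpenPifPaf API; the names `build_graph` and `is_connected` and the example keypoint names are hypothetical.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, Set, Tuple

Node = Tuple[int, str]  # (frame index, keypoint name), an assumed encoding


def build_graph(edges: Iterable[Tuple[Node, Node]]) -> Dict[Node, Set[Node]]:
    """Build an undirected adjacency map from spatial and temporal edges."""
    adjacency: Dict[Node, Set[Node]] = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def is_connected(adjacency: Dict[Node, Set[Node]]) -> bool:
    """Breadth-first search: True if every node is reachable from the first."""
    if not adjacency:
        return True
    start = next(iter(adjacency))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == len(adjacency)


spatial = [
    ((0, "shoulder"), (0, "elbow")),  # within frame 0
    ((1, "shoulder"), (1, "elbow")),  # within frame 1
]
temporal = [((0, "shoulder"), (1, "shoulder"))]  # same joint across frames

print(is_connected(build_graph(spatial + temporal)))
print(is_connected(build_graph(spatial)))
```

Without the temporal edge the two per-frame skeletons are separate components, which is why the cross-frame association is what turns per-frame poses into one tracked pose.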

Polarized Self-Attention: Towards High-quality Pixel-wise Regression

DeLightCMU/PSA arXiv preprint 2021

Pixel-wise regression is probably the most common problem in fine-grained computer vision tasks, such as estimating keypoint heatmaps and segmentation masks.

Associative Embedding: End-to-End Learning for Joint Detection and Grouping

open-mmlab/mmpose NeurIPS 2017

We introduce associative embedding, a novel method for supervising convolutional neural networks for the task of detection and grouping.
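
The grouping idea can be sketched as follows, assuming each detected joint comes with a scalar tag and that joints belonging to the same person receive similar tags. This is a simplified greedy grouping for illustration, not the authors' implementation; the function name `group_by_tag`, the tuple layout, and the `threshold` value are assumptions.

```python
from typing import Dict, List, Tuple

# (joint name, x, y, tag): an assumed layout for one detection.
Detection = Tuple[str, float, float, float]


def group_by_tag(
    detections: List[Detection], threshold: float = 0.5
) -> List[Dict[str, Tuple[float, float]]]:
    """Greedily assign each detection to the person whose mean tag is closest."""
    groups: List[dict] = []
    for joint, x, y, tag in detections:
        best = None
        best_dist = threshold
        for group in groups:
            # A person can hold at most one instance of each joint type.
            if joint in group["joints"]:
                continue
            mean_tag = sum(group["tags"]) / len(group["tags"])
            dist = abs(tag - mean_tag)
            if dist < best_dist:
                best, best_dist = group, dist
        if best is None:
            # No existing person is close enough: start a new one.
            best = {"joints": {}, "tags": []}
            groups.append(best)
        best["joints"][joint] = (x, y)
        best["tags"].append(tag)
    return [group["joints"] for group in groups]


people = group_by_tag([
    ("head", 10.0, 10.0, 0.10),
    ("head", 50.0, 12.0, 2.00),
    ("neck", 11.0, 20.0, 0.15),
    ("neck", 51.0, 22.0, 1.90),
])
print(people)
```

Only the relative distance between tags matters here, not their absolute values, so the threshold would need tuning to whatever tag scale a trained network actually produces.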

Cascaded Pyramid Network for Multi-Person Pose Estimation

chenyilun95/tf-cpn CVPR 2018

In this paper, we present a novel network structure called Cascaded Pyramid Network (CPN), which aims to relieve the problem posed by these "hard" keypoints.

Dynamic Convolution: Attention over Convolution Kernels

xmu-xiaoma666/External-Attention-pytorch CVPR 2020

Light-weight convolutional neural networks (CNNs) suffer performance degradation as their low computational budgets constrain both the depth (number of convolution layers) and the width (number of channels) of CNNs, resulting in limited representation capability.