Task	Dataset	Model	Metric Name	Metric Value	Global Rank
2D Pose Estimation	MP-100	PoseAnything-S	Mean PCK@0.2 - 1shot	90.43	#1
2D Pose Estimation	MP-100	PoseAnything-S	Mean PCK@0.2 - 5shot	93.00	#1
2D Pose Estimation	MP-100	PoseAnything-T	Mean PCK@0.2 - 1shot	87.47	#2
2D Pose Estimation	MP-100	PoseAnything-T	Mean PCK@0.2 - 5shot	91.12	#2
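
The metric reported in this table, PCK@0.2 (Percentage of Correct Keypoints at a 0.2 threshold), counts a predicted keypoint as correct when it lies within 0.2 times a reference length of the ground-truth location. The following is a minimal sketch of that computation for a single image, not the benchmark's evaluation code. It assumes the reference length is the object's bounding-box size; the function name `pck`, its parameters, and the sample coordinates are hypothetical and chosen only for illustration.

```python
import math


def pck(pred, gt, bbox_size, threshold=0.2, visible=None):
    """Return the fraction of visible keypoints predicted within
    threshold * bbox_size of their ground-truth location.

    Assumption: distances are normalized by a single bounding-box size.
    """
    if visible is None:
        visible = [True] * len(gt)
    max_dist = threshold * bbox_size
    hits = 0
    counted = 0
    for (px, py), (gx, gy), vis in zip(pred, gt, visible):
        if not vis:
            continue  # keypoints without annotation are not scored
        counted += 1
        if math.hypot(px - gx, py - gy) <= max_dist:
            hits += 1
    return hits / counted if counted else float("nan")


# Made-up coordinates for a 100-pixel bounding box (allowed error: 20 px).
gt = [(10.0, 10.0), (50.0, 50.0), (90.0, 20.0), (30.0, 80.0)]
pred = [(12.0, 11.0), (50.0, 75.0), (95.0, 20.0), (30.0, 65.0)]
score = pck(pred, gt, bbox_size=100.0)  # second keypoint misses by 25 px
```

Under these assumptions, a table value such as 90.43 would correspond to a fraction of 0.9043 after averaging over the evaluated keypoints.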

Pose Anything: A Graph-Based Approach for Category-Agnostic Pose Estimation

29 Nov 2023 · Or Hirschorn, Shai Avidan

Traditional 2D pose estimation models are limited by their category-specific design, making them suitable only for predefined object categories. This restriction becomes particularly challenging when dealing with novel objects due to the lack of relevant training data. To address this limitation, category-agnostic pose estimation (CAPE) was introduced. CAPE aims to enable keypoint localization for arbitrary object categories using a single model, requiring minimal support images with annotated keypoints. This approach not only enables object pose generation based on arbitrary keypoint definitions but also significantly reduces the associated costs, paving the way for versatile and adaptable pose estimation applications. We present a novel approach to CAPE that leverages the inherent geometrical relations between keypoints through a newly designed Graph Transformer Decoder. By capturing and incorporating this crucial structural information, our method enhances the accuracy of keypoint localization, marking a significant departure from conventional CAPE techniques that treat keypoints as isolated entities. We validate our approach on the MP-100 benchmark, a comprehensive dataset comprising over 20,000 images spanning more than 100 categories. Our method outperforms the prior state-of-the-art by substantial margins, achieving remarkable improvements of 2.16% and 1.82% under 1-shot and 5-shot settings, respectively. Furthermore, our method's end-to-end training demonstrates both scalability and efficiency compared to previous CAPE approaches.
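
To make the idea of "geometrical relations between keypoints" concrete, the sketch below builds a row-normalized adjacency matrix from a list of skeleton edges and performs one neighbor-averaging step over per-keypoint features. This is a generic graph-aggregation illustration, not the paper's Graph Transformer Decoder; the function names, the three-keypoint chain, and the one-dimensional features are all hypothetical.

```python
def normalized_adjacency(num_keypoints, edges):
    """Build a row-normalized adjacency matrix with self-loops
    from undirected skeleton edges given as (i, j) index pairs."""
    adj = [[0.0] * num_keypoints for _ in range(num_keypoints)]
    for i in range(num_keypoints):
        adj[i][i] = 1.0  # self-loop keeps each keypoint's own feature
    for a, b in edges:
        adj[a][b] = 1.0
        adj[b][a] = 1.0
    return [[v / sum(row) for v in row] for row in adj]


def aggregate(adj, features):
    """Replace each keypoint feature with the weighted average of
    its own feature and those of its skeleton neighbors."""
    n = len(features)
    dim = len(features[0])
    return [
        [sum(adj[i][j] * features[j][d] for j in range(n)) for d in range(dim)]
        for i in range(n)
    ]


# Hypothetical 3-keypoint chain: 0 - 1 - 2, with 1-D features.
edges = [(0, 1), (1, 2)]
features = [[0.0], [3.0], [6.0]]
adj = normalized_adjacency(3, edges)
mixed = aggregate(adj, features)
```

In this toy setup each keypoint's feature is pulled toward its connected neighbors, whereas a model that treats keypoints as isolated entities would leave the features unchanged.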

Code

orhir/PoseAnything official

Tasks

2D Pose Estimation

Animal Pose Estimation

Category-Agnostic Pose Estimation

Decoder

Keypoint Detection

Object

Pose Estimation

Vehicle Pose Estimation

Datasets

MP-100

Results from the Paper

Ranked #1 on 2D Pose Estimation on MP-100

Methods

Absolute Position Encodings • Adam • BPE • Dense Connections • Dropout • Graph Transformer • Label Smoothing • LapEigen • Laplacian PE • Layer Normalization • Linear Layer • Multi-Head Attention • Position-Wise Feed-Forward Layer • Residual Connection • Scaled Dot-Product Attention • Softmax • Transformer
