A New Representation of Skeleton Sequences for 3D Action Recognition

This paper presents a new method for 3D action recognition with skeleton sequences (i.e., 3D trajectories of human skeleton joints). The proposed method first transforms each skeleton sequence into three clips, each consisting of several frames, for spatial-temporal feature learning using deep neural networks. Each clip is generated from one channel of the cylindrical coordinates of the skeleton sequence. Each frame of the generated clips represents the temporal information of the entire skeleton sequence and incorporates one particular spatial relationship between the joints. Together, the clips contain multiple frames with different spatial relationships, which provide useful spatial structural information about the human skeleton. We propose to use deep convolutional neural networks to learn long-term temporal information of the skeleton sequence from the frames of the generated clips, and then use a Multi-Task Learning Network (MTLN) to jointly process all frames of the generated clips in parallel, incorporating spatial structural information for action recognition. Experimental results clearly show the effectiveness of the proposed representation and feature learning method for 3D action recognition.
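
To make the clip-generation step concrete, below is a minimal sketch, assuming a sequence of T time steps with J joints given in Cartesian coordinates. The function name `skeleton_to_clips` and the reference-joint indices are illustrative placeholders, not taken from the paper: relative joint positions are converted to cylindrical coordinates, and each of the three coordinate channels becomes one clip whose frames each span the whole sequence.

```python
import numpy as np

def skeleton_to_clips(seq, ref_joints=(4, 8, 12, 16)):
    """Turn a skeleton sequence into three clips of 2D frames.

    Sketch of the idea described in the abstract: relative joint positions
    are expressed in cylindrical coordinates, and each of the three
    coordinate channels becomes one clip. Each frame of a clip is built
    from one reference joint (the indices in `ref_joints` are placeholders,
    not the ones used in the paper), so a clip contains len(ref_joints)
    frames, each of size (num_joints, num_time_steps).

    seq: array of shape (T, J, 3) -- T time steps, J joints, (x, y, z).
    Returns: array of shape (3, len(ref_joints), J, T).
    """
    T, J, _ = seq.shape
    clips = np.zeros((3, len(ref_joints), J, T), dtype=np.float32)
    for f, ref in enumerate(ref_joints):
        # Vectors from the reference joint to every joint, per time step.
        rel = seq - seq[:, ref:ref + 1, :]                    # (T, J, 3)
        # Cartesian -> cylindrical: radius, azimuth, height.
        rho = np.sqrt(rel[..., 0] ** 2 + rel[..., 1] ** 2)    # (T, J)
        theta = np.arctan2(rel[..., 1], rel[..., 0])          # (T, J)
        z = rel[..., 2]                                       # (T, J)
        for c, channel in enumerate((rho, theta, z)):
            # Rows index joints, columns index time, so each frame
            # covers the entire sequence (long-term temporal information).
            clips[c, f] = channel.T
    return clips

if __name__ == "__main__":
    # Toy sequence: 40 time steps, 20 joints, random 3D positions.
    seq = np.random.randn(40, 20, 3).astype(np.float32)
    clips = skeleton_to_clips(seq)
    print(clips.shape)  # (3, 4, 20, 40)
```

The resulting frames can then be passed to a CNN for long-term temporal feature learning and to a multi-task network that processes all frames jointly, as described in the abstract.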

CVPR 2017
Task                                Dataset         Model                        Metric                    Value   Global Rank
Skeleton Based Action Recognition   NTU RGB+D       Clips+CNN+MTLN               Accuracy (Cross-View)     84.8%   #98
Skeleton Based Action Recognition   NTU RGB+D       Clips+CNN+MTLN               Accuracy (Cross-Subject)  79.6%   #99
Skeleton Based Action Recognition   NTU RGB+D 120   Multi-Task Learning Network  Accuracy (Cross-Subject)  58.4%   #65
Skeleton Based Action Recognition   NTU RGB+D 120   Multi-Task Learning Network  Accuracy (Cross-Setup)    57.9%   #66
