Vertex Feature Encoding and Hierarchical Temporal Modeling in a Spatial-Temporal Graph Convolutional Network for Action Recognition

20 Dec 2019 • Konstantinos Papadopoulos • Enjie Ghorbel • Djamila Aouada • Björn Ottersten

This paper extends the Spatial-Temporal Graph Convolutional Network (ST-GCN) for skeleton-based action recognition by introducing two novel modules, namely, the Graph Vertex Feature Encoder (GVFE) and the Dilated Hierarchical Temporal Convolutional Network (DH-TCN). On the one hand, the GVFE module learns appropriate vertex features for action recognition by encoding raw skeleton data into a new feature space...
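The abstract is cut off before it describes DH-TCN, so the module's actual design is not available here. As a rough illustration of the general idea its name suggests, the sketch below stacks 1-D dilated temporal convolutions and computes the resulting receptive field. The function names, the kernel, and the dilation schedule `[1, 2, 4]` are assumptions made for this example, not details taken from the paper.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded ('same' length) 1-D dilated convolution over time.

    Hypothetical helper: `signal` is one feature channel of one joint
    across frames; `kernel` is a list of odd length.
    """
    half = (len(kernel) // 2) * dilation
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for i, w in enumerate(kernel):
            idx = t - half + i * dilation  # taps are `dilation` frames apart
            if 0 <= idx < len(signal):     # out-of-range taps act as zeros
                acc += w * signal[idx]
        out.append(acc)
    return out


def stacked_dilated(signal, kernel, dilations):
    """Apply one dilated layer per entry of `dilations`, in order."""
    out = list(signal)
    for d in dilations:
        out = dilated_conv1d(out, kernel, d)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input frames that can influence one output frame."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


if __name__ == "__main__":
    # Assumed schedule: three layers, kernel size 3, dilations 1, 2, 4.
    print(receptive_field(3, [1, 2, 4]))
```

With these assumed settings, three layers cover 15 frames, whereas three undilated layers with the same kernel size would cover 7. That wider temporal context at equal depth is the usual motivation for dilation.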


Results from the Paper


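| TASK | DATASET | MODEL | METRIC NAME | METRIC VALUE (%) | GLOBAL RANK |
|---|---|---|---|---|---|
| Skeleton Based Action Recognition | NTU RGB+D | GVFE + AS-GCN with DH-TCN | Accuracy (CV) | 92.8 | #24 |
| Skeleton Based Action Recognition | NTU RGB+D | GVFE + AS-GCN with DH-TCN | Accuracy (CS) | 85.3 | #31 |
| Skeleton Based Action Recognition | NTU RGB+D 120 | GVFE + AS-GCN with DH-TCN | Accuracy (Cross-Subject) | 78.3 | #12 |
| Skeleton Based Action Recognition | NTU RGB+D 120 | GVFE + AS-GCN with DH-TCN | Accuracy (Cross-Setup) | 79.8 | #12 |
| Action Recognition | NTU RGB+D 120 | ST-GCN + AS-GCN w/DH-TCN | Accuracy (Cross-Subject) | 79.2 | #1 |
| Action Recognition | NTU RGB+D 120 | ST-GCN + AS-GCN w/DH-TCN | Accuracy (Cross-Setup) | 78.3 | #1 |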
