Temporal Extension Module for Skeleton-Based Action Recognition

arXiv 2020 · Yuya Obinata, Takuma Yamamoto

We present a module that extends the temporal graph of a graph convolutional network (GCN) for action recognition from a sequence of skeletons. Existing methods try to find a more appropriate spatial graph within each frame (intra-frame) but do not optimize the temporal graph between frames (inter-frame). Concretely, these methods connect only the vertices that correspond to the same joint across frames. In this work, we add inter-frame connections to multiple neighboring vertices and extract additional features based on the extended temporal graph. Our module is a simple yet effective way to extract correlated features of multiple joints in human movement. Moreover, it yields further performance improvements when combined with other GCN methods that optimize only the spatial graph. We conduct extensive experiments on two large datasets, NTU RGB+D and Kinetics-Skeleton, and demonstrate that our module is effective for several existing models and that our final model achieves state-of-the-art performance.
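The difference between a same-joint temporal graph and an extended one can be illustrated with a small sketch. This is not the authors' module, which also extracts features over the graph; it only enumerates inter-frame edges. The function name `temporal_edges`, the toy three-joint skeleton, and the choice of one-hop spatial neighbors as the "neighboring vertices" are all assumptions made for illustration.

```python
def temporal_edges(spatial_edges, num_joints, num_frames, extended):
    """Enumerate inter-frame edges ((t, v), (t + 1, u)).

    Hypothetical helper, not from the paper.
    extended=False: each joint links only to the same joint in the next frame.
    extended=True:  each joint also links to its one-hop spatial neighbors
                    in the next frame (an assumed neighborhood definition).
    """
    # Every joint is always connected to itself across frames.
    neighbors = {v: {v} for v in range(num_joints)}
    if extended:
        for a, b in spatial_edges:
            neighbors[a].add(b)
            neighbors[b].add(a)

    edges = set()
    for t in range(num_frames - 1):
        for v in range(num_joints):
            for u in neighbors[v]:
                edges.add(((t, v), (t + 1, u)))
    return edges


# Toy skeleton: a chain of three joints, 0 - 1 - 2, observed over two frames.
spatial = [(0, 1), (1, 2)]
baseline = temporal_edges(spatial, num_joints=3, num_frames=2, extended=False)
extended = temporal_edges(spatial, num_joints=3, num_frames=2, extended=True)

print(len(baseline))  # 3 same-joint edges
print(len(extended))  # 7 edges once spatial neighbors are included
```

In this toy case the extended graph is a superset of the baseline, so any model built on it still sees the same-joint connections and gains cross-joint ones on top.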

Task                               Dataset                    Model         Metric Name    Metric Value  Global Rank
Skeleton Based Action Recognition  Kinetics-Skeleton dataset  2s-AGCN+TEM   Accuracy       38.6          # 7
Skeleton Based Action Recognition  NTU RGB+D                  MS-AAGCN+TEM  Accuracy (CV)  96.5          # 22
Skeleton Based Action Recognition  NTU RGB+D                  MS-AAGCN+TEM  Accuracy (CS)  91.0          # 31

Methods