Temporal Decoupling Graph Convolutional Network for Skeleton-based Gesture Recognition
Skeleton-based gesture recognition methods have achieved considerable success using Graph Convolutional Networks (GCNs), which commonly use an adjacency matrix to model the spatial topology of skeletons. However, previous methods use the same adjacency matrix for skeletons from different frames, which limits the flexibility of GCNs in modeling temporal information. To solve this problem, we propose a Temporal Decoupling Graph Convolutional Network (TD-GCN), which applies different adjacency matrices to skeletons from different frames. The main steps of each convolution layer in our proposed TD-GCN are as follows. First, high-level spatiotemporal features are extracted from the skeleton data. Then, channel-dependent and temporal-dependent adjacency matrices, corresponding to different channels and frames, are calculated to capture the spatiotemporal dependencies between skeleton joints. Finally, to fuse topology information from neighboring skeleton joints, the spatiotemporal features of the joints are aggregated using the channel-dependent and temporal-dependent adjacency matrices. To the best of our knowledge, we are the first to use temporal-dependent adjacency matrices for temporal-sensitive topology learning from skeleton joints. The proposed TD-GCN effectively improves the modeling ability of GCNs and achieves state-of-the-art results on gesture datasets including SHREC'17 Track and DHG-14/28. Our code is available at: https://github.com/liujf69/TD-GCN-Gesture
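The three steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). The function name `td_gcn_layer_sketch`, the scaling parameters `alpha` and `beta`, and in particular the way the adjacency matrices are derived (a shared base adjacency plus `tanh` of pairwise differences between pooled joint features) are assumptions made for illustration only. The point it shows is the shape of the idea: one adjacency matrix per channel and one per frame, instead of a single matrix shared by all frames.

```python
import numpy as np


def td_gcn_layer_sketch(x, weight, base_adj, alpha=1.0, beta=1.0):
    """Illustrative layer with channel- and frame-specific adjacency matrices.

    x:        (C_in, T, V) joint features (channels, frames, joints)
    weight:   (C_out, C_in) pointwise feature transform
    base_adj: (V, V) shared skeleton adjacency
    alpha, beta: hypothetical scaling factors for the refinements
    """
    # Step 1: transform the features of every joint in every frame.
    h = np.einsum("oc,ctv->otv", weight, x)  # (C_out, T, V)

    # Step 2a (assumed form): channel-dependent adjacency, one (V, V) per channel.
    pooled_over_time = h.mean(axis=1)  # (C_out, V)
    diff_c = pooled_over_time[:, :, None] - pooled_over_time[:, None, :]
    adj_c = base_adj + alpha * np.tanh(diff_c)  # (C_out, V, V)

    # Step 2b (assumed form): temporal-dependent adjacency, one (V, V) per frame.
    pooled_over_channels = h.mean(axis=0)  # (T, V)
    diff_t = pooled_over_channels[:, :, None] - pooled_over_channels[:, None, :]
    adj_t = base_adj + beta * np.tanh(diff_t)  # (T, V, V)

    # Step 3: aggregate neighbor features with both sets of matrices and sum.
    out_c = np.einsum("cuv,ctv->ctu", adj_c, h)  # channel c uses adj_c[c]
    out_t = np.einsum("tuv,ctv->ctu", adj_t, h)  # frame t uses adj_t[t]
    return out_c + out_t, adj_c, adj_t


# Usage with random data; 22 joints and 16 frames are illustrative values.
rng = np.random.default_rng(0)
features = rng.standard_normal((3, 16, 22))
transform = rng.standard_normal((8, 3))
out, adj_c, adj_t = td_gcn_layer_sketch(features, transform, np.eye(22))
print(out.shape, adj_c.shape, adj_t.shape)
```

With `alpha` and `beta` set to zero, both sets of matrices collapse to the shared `base_adj`, which corresponds to the conventional setting the abstract argues against: every frame is aggregated with the same topology.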
Results from the Paper
Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
---|---|---|---|---|---|
Hand Gesture Recognition | DHG-14 | TD-GCN | Accuracy | 93.9 | #2 |
Hand Gesture Recognition | DHG-28 | TD-GCN | Accuracy | 91.4 | #3 |
Skeleton Based Action Recognition | NTU RGB+D | TD-GCN | Accuracy (CV) | 96.8 | #17 |
Skeleton Based Action Recognition | NTU RGB+D | TD-GCN | Accuracy (CS) | 92.8 | #16 |
Skeleton Based Action Recognition | NTU RGB+D | TD-GCN | Ensembled Modalities | 4 | #2 |
Skeleton Based Action Recognition | N-UCLA | TD-GCN | Accuracy | 97.4 | #3 |
Skeleton Based Action Recognition | SHREC 2017 track on 3D Hand Gesture Recognition | TD-GCN | 28 gestures accuracy | 95.36 | #1 |
Skeleton Based Action Recognition | SHREC 2017 track on 3D Hand Gesture Recognition | TD-GCN | 14 gestures accuracy | 97.02 | #1 |