Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?

The purpose of this study is to determine whether current video datasets have sufficient data for training very deep convolutional neural networks (CNNs) with spatio-temporal three-dimensional (3D) kernels. Recently, the performance levels of 3D CNNs in the field of action recognition have improved significantly...

CVPR 2018

Results from the Paper


TASK                DATASET  MODEL              METRIC NAME      METRIC VALUE  GLOBAL RANK
Action Recognition  UCF101   ResNeXt-101 (64f)  3-fold Accuracy  94.5          # 16

Methods used in the Paper