Large-scale weakly-supervised pre-training for video action recognition

Current fully-supervised video datasets consist of only a few hundred thousand videos and fewer than a thousand domain-specific labels. This hinders the progress towards advanced video architectures...

CVPR 2019

Results from the Paper


 Ranked #1 on Egocentric Activity Recognition on EPIC-Kitchens (Actions Top-1 (S2) metric)

Task                             Dataset        Model                                   Metric name         Metric value  Global rank
Egocentric Activity Recognition  EPIC-Kitchens  R(2+1)D-152-SE (ig)                     Actions Top-1 (S2)  25.6          #1
Egocentric Activity Recognition  EPIC-Kitchens  R(2+1)D-34 (kinetics)                   Actions Top-1 (S2)  16.8          #5
Action Classification            Kinetics-400   irCSN-152 (IG-Kinetics-65M pretrain)    Accuracy            82.8          #2
Action Recognition               Kinetics-400   R(2+1)D-152*                            Video hit@1         81.3          #1
Action Recognition               Kinetics-400   R(2+1)D-152*                            Video hit@5         95.1          #1
