The Kinetics dataset is a large-scale, high-quality dataset for human action recognition in videos. The dataset consists of around 500,000 video clips covering 600 human action classes with at least 600 video clips for each action class. Each video clip lasts around 10 seconds and is labeled with a single action class. The videos are collected from YouTube.
1,182 PAPERS • 28 BENCHMARKS
TAPOS is a new dataset developed on sport videos with manual annotations of sub-actions, and conduct a study on temporal action parsing on top. A sport activity usually consists of multiple sub-actions and that the awareness of such temporal structures is beneficial to action recognition.
8 PAPERS • 1 BENCHMARK