Kinetics-400

Introduced by Kay et al. in The Kinetics Human Action Video Dataset

The dataset contains 400 human action classes, with at least 400 video clips for each action. Each clip lasts around 10 seconds and is taken from a different YouTube video. The actions are human-focused and cover a broad range of classes, including human-object interactions such as playing instruments and human-human interactions such as shaking hands.

Source: https://arxiv.org/abs/1705.06950
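
The figures in the description imply a lower bound on the dataset's size. The following is a minimal sketch of that arithmetic, not an official statistic: it assumes every class has exactly the stated minimum of 400 clips and that every clip is exactly 10 seconds long, so the real totals may be higher. The function and constant names are hypothetical and chosen for illustration only.

```python
# Lower bounds implied by the dataset description.
# Assumptions (not from the source): each class has exactly the minimum
# number of clips, and each clip is exactly 10 seconds long.
NUM_CLASSES = 400           # "400 human action classes"
MIN_CLIPS_PER_CLASS = 400   # "at least 400 video clips for each action"
APPROX_CLIP_SECONDS = 10    # "around 10s", treated as exact here


def min_total_clips(num_classes=NUM_CLASSES, min_clips=MIN_CLIPS_PER_CLASS):
    """Smallest total clip count consistent with the description."""
    return num_classes * min_clips


def approx_min_hours(num_clips, clip_seconds=APPROX_CLIP_SECONDS):
    """Approximate total duration in hours for the given number of clips."""
    return num_clips * clip_seconds / 3600


clips = min_total_clips()
hours = approx_min_hours(clips)
print(f"at least {clips:,} clips, roughly {hours:.0f} hours of video or more")
```

Under these assumptions the description implies at least 160,000 clips, which is useful as a rough sanity check when estimating storage or download time.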

Benchmarks

Task	Dataset Variant	Best Model
Action Classification	Kinetics-400	InternVideo2-6B
Self-Supervised Action Recognition	Kinetics-400	CVRL
Action Recognition In Videos	Kinetics-400	CAST-B/16
Boundary Detection	Kinetics-400	CASTANET+ Ensemble
Event Segmentation	Kinetics-400	CASTANET+ Ensemble
Skeleton Based Action Recognition	Kinetics-400	STGAT