Spatio-Temporal Action Localization
13 papers with code • 1 benchmark • 6 datasets
Most implemented papers
E^2TAD: An Energy-Efficient Tracking-based Action Detector
Video action detection (also known as spatio-temporal action localization) is often the starting point for human-centric intelligent analysis of videos.
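Spatio-temporal action localization is typically evaluated by matching predicted action "tubes" (a box per frame) against ground truth with a spatio-temporal IoU. The exact definition varies by benchmark; the sketch below shows one common variant (mean per-frame box IoU over the temporal overlap, scaled by temporal IoU), with illustrative function names:

```python
def box_iou(a, b):
    """IoU of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def tube_iou(tube_a, tube_b):
    """Spatio-temporal IoU of two action tubes.

    Each tube maps frame index -> box. Score = mean box IoU over the
    frames both tubes cover, times the temporal IoU (shared frames
    divided by the union of frames).
    """
    shared = set(tube_a) & set(tube_b)
    if not shared:
        return 0.0
    spatial = sum(box_iou(tube_a[f], tube_b[f]) for f in shared) / len(shared)
    temporal = len(shared) / len(set(tube_a) | set(tube_b))
    return spatial * temporal
```

A detection is then usually counted as correct when its tube IoU with a same-class ground-truth tube exceeds a threshold (e.g. 0.5), feeding into a video-mAP metric.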
Unmasked Teacher: Towards Training-Efficient Video Foundation Models
Previous video foundation models (VFMs) rely on Image Foundation Models (IFMs), which face challenges in transferring to the video domain.
VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
Finally, we successfully train a video ViT model with a billion parameters, which achieves new state-of-the-art performance on the Kinetics datasets (90.0% on K400 and 89.9% on K600) and Something-Something (68.7% on V1 and 77.0% on V2).
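The "dual masking" in the title refers to masking the inputs of both the encoder (tube masking at a high ratio, so the same spatial patches are hidden across frames) and the decoder (only a subset of the hidden tokens is reconstructed). A minimal illustration of the idea, not the paper's implementation; all names and ratios below are assumptions for the sketch:

```python
import numpy as np

def dual_masks(num_frames, num_patches, enc_ratio=0.9, dec_ratio=0.5, seed=0):
    """Illustrative dual masking for a video masked autoencoder.

    enc_mask: True = token hidden from the encoder (tube masking: the
    same spatial patches are dropped in every frame, at a high ratio).
    dec_mask: True = hidden token the decoder actually reconstructs
    (a random subset here; the paper uses a structured scheme).
    """
    rng = np.random.default_rng(seed)
    hidden = np.zeros(num_patches, dtype=bool)
    hidden[rng.choice(num_patches, int(num_patches * enc_ratio), replace=False)] = True
    # Share the spatial mask across all frames -> "tubes" of hidden patches.
    enc_mask = np.tile(hidden, (num_frames, 1))
    # Reconstruct only a fraction of the hidden tokens to cut decoder cost.
    dec_mask = enc_mask & (rng.random(enc_mask.shape) < dec_ratio)
    return enc_mask, dec_mask
```

With a 90% encoder ratio the encoder processes roughly 10% of tokens, and the decoder reconstructs only part of the remainder, which is what makes billion-parameter pre-training tractable.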