Action Classification

60 papers with code · Computer Vision
Subtask of Video

Greatest papers with code

Large-scale weakly-supervised pre-training for video action recognition

CVPR 2019 microsoft/computervision-recipes

Second, frame-based models perform quite well on action recognition; is pre-training for good image features sufficient or is pre-training for spatio-temporal features valuable for optimal transfer learning?

ACTION CLASSIFICATION ACTION RECOGNITION ACTIVITY RECOGNITION IN VIDEOS EGOCENTRIC ACTIVITY RECOGNITION TRANSFER LEARNING

Omni-sourced Webly-supervised Learning for Video Recognition

ECCV 2020 open-mmlab/mmaction

Then a joint-training strategy is proposed to deal with the domain gaps between multiple data sources and formats in webly-supervised learning.

Ranked #1 on Action Classification on Kinetics-400 (using extra training data)

ACTION CLASSIFICATION VIDEO RECOGNITION

Temporal Segment Networks for Action Recognition in Videos

8 May 2017 open-mmlab/mmaction

Furthermore, based on the temporal segment networks, we won the video classification track at the ActivityNet challenge 2016 among 24 teams, which demonstrates the effectiveness of TSN and the proposed good practices.

Ranked #10 on Action Classification on Moments in Time (Top 5 Accuracy metric)

ACTION CLASSIFICATION ACTION RECOGNITION ACTION RECOGNITION IN VIDEOS

Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

CVPR 2017 deepmind/kinetics-i3d

The paucity of videos in current action classification datasets (UCF-101 and HMDB-51) has made it difficult to identify good video architectures, as most methods obtain similar performance on existing small-scale benchmarks.

ACTION CLASSIFICATION ACTION RECOGNITION SKELETON BASED ACTION RECOGNITION

FastGRNN: A Fast, Accurate, Stable and Tiny Kilobyte Sized Gated Recurrent Neural Network

NeurIPS 2018 Microsoft/EdgeML

FastRNN addresses these limitations by adding a residual connection that does not constrain the range of the singular values explicitly and has only two extra scalar parameters.
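
The residual connection described above can be sketched as a single recurrent update, h_t = alpha * tanh(W x_t + U h_{t-1} + b) + beta * h_{t-1}, where alpha and beta are the two extra scalars. The sketch below is illustrative only: the function names are hypothetical, tanh is assumed as the nonlinearity, and alpha and beta are fixed by hand rather than learned.

```python
import math


def fastrnn_step(x, h_prev, W, U, b, alpha, beta):
    """One FastRNN-style update with a weighted residual connection.

    x: input vector, h_prev: previous hidden state,
    W: hidden x input matrix, U: hidden x hidden matrix, b: bias vector,
    alpha, beta: the two extra scalars (hand-set here, an assumption).
    """
    h_new = []
    for i in range(len(h_prev)):
        pre = b[i]
        pre += sum(W[i][j] * x[j] for j in range(len(x)))
        pre += sum(U[i][j] * h_prev[j] for j in range(len(h_prev)))
        candidate = math.tanh(pre)  # ordinary RNN candidate state
        # Residual mix: small alpha and beta near 1 keep most of h_prev.
        h_new.append(alpha * candidate + beta * h_prev[i])
    return h_new


def run_fastrnn(xs, W, U, b, alpha, beta):
    """Apply fastrnn_step over a sequence, starting from a zero state."""
    h = [0.0] * len(b)
    for x in xs:
        h = fastrnn_step(x, h, W, U, b, alpha, beta)
    return h
```

Setting alpha = 1 and beta = 0 recovers a plain RNN step, while alpha = 0 and beta = 1 copies the previous state unchanged.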

ACTION CLASSIFICATION LANGUAGE MODELLING SPEECH RECOGNITION TIME SERIES TIME SERIES CLASSIFICATION

Resource-efficient Machine Learning in 2 KB RAM for the Internet of Things

ICML 2017 Microsoft/EdgeML

This paper develops a novel tree-based algorithm, called Bonsai, for efficient prediction on IoT devices – such as those based on the Arduino Uno board having an 8 bit ATmega328P microcontroller operating at 16 MHz with no native floating point support, 2 KB RAM and 32 KB read-only flash.

ACTION CLASSIFICATION

What Makes Training Multi-Modal Classification Networks Hard?

CVPR 2020 facebookresearch/VMZ

Consider end-to-end training of a multi-modal vs. a single-modal network on a task with multiple input modalities: the multi-modal network receives more information, so it should match or outperform its single-modal counterpart.

ACTION CLASSIFICATION ACTION RECOGNITION

Video Classification with Channel-Separated Convolutional Networks

ICCV 2019 facebookresearch/R2Plus1D

It is natural to ask: 1) if group convolution can help to alleviate the high computational cost of video classification networks; 2) what factors matter the most in 3D group convolutional networks; and 3) what are good computation/accuracy trade-offs with 3D group convolutional networks.
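
To give a rough sense of the first question, the weight count of a 3D convolution with g groups is (C_in / g) * C_out * k_t * k_h * k_w. The sketch below compares a standard 3x3x3 convolution with one possible channel-separated factorization (a 1x1x1 pointwise convolution plus a 3x3x3 depthwise convolution). The helper name and the 256-channel setting are assumptions chosen for illustration, not values taken from the paper.

```python
def conv3d_params(c_in, c_out, kernel, groups=1):
    """Number of weights in a 3D convolution, ignoring biases."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    kt, kh, kw = kernel
    # Each output channel only sees c_in / groups input channels.
    return (c_in // groups) * c_out * kt * kh * kw


channels = 256  # illustrative width, an assumption

standard = conv3d_params(channels, channels, (3, 3, 3))
depthwise = conv3d_params(channels, channels, (3, 3, 3), groups=channels)
pointwise = conv3d_params(channels, channels, (1, 1, 1))
separated = depthwise + pointwise

print(standard)   # 1769472
print(separated)  # 72448
```

In this setting the factorized pair uses roughly 24 times fewer weights than the standard convolution. The multiply-add count shrinks by the same factor when the output size is unchanged, since every weight is applied once per output position.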

ACTION CLASSIFICATION ACTION RECOGNITION IMAGE CLASSIFICATION