Video Classification

172 papers with code • 11 benchmarks • 17 datasets

Video Classification is the task of producing a label that is relevant to a video given its frames. A good video-level classifier not only provides accurate frame labels, but also best describes the entire video given the features and annotations of its individual frames. For example, a video might contain a tree in some frame, but the label that is central to the video might be something else (e.g., “hiking”). The granularity of the labels needed to describe the frames and the video depends on the task. Typical tasks include assigning one or more global labels to the video, and assigning one or more labels to each frame of the video.

Source: Efficient Large Scale Video Classification
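
The two typical tasks described above, per-frame labels and global video labels, can be illustrated with a minimal sketch. It assumes per-frame class scores are already available from some frame-level model; the function names, the mean-pooling aggregation, the 0.5 threshold, and the toy scores are all hypothetical choices for illustration, not a method taken from the cited source.

```python
from typing import List, Sequence


def frame_labels(frame_scores: Sequence[Sequence[float]],
                 class_names: Sequence[str]) -> List[str]:
    """Return the top-scoring class name for each frame."""
    labels = []
    for scores in frame_scores:
        best = max(range(len(scores)), key=lambda c: scores[c])
        labels.append(class_names[best])
    return labels


def video_labels(frame_scores: Sequence[Sequence[float]],
                 class_names: Sequence[str],
                 threshold: float = 0.5) -> List[str]:
    """Average each class score over all frames (mean pooling, an assumed
    aggregation) and keep the classes whose mean reaches the threshold."""
    if not frame_scores:
        raise ValueError("video has no frames")
    n_frames = len(frame_scores)
    means = [
        sum(scores[c] for scores in frame_scores) / n_frames
        for c in range(len(class_names))
    ]
    return [name for name, m in zip(class_names, means) if m >= threshold]


# Toy data (hypothetical): one row per frame, one column per class.
CLASS_NAMES = ["tree", "hiking"]
FRAME_SCORES = [
    [0.9, 0.6],  # a tree dominates this frame
    [0.1, 0.8],
    [0.2, 0.7],
]

per_frame = frame_labels(FRAME_SCORES, CLASS_NAMES)
global_labels = video_labels(FRAME_SCORES, CLASS_NAMES)
print(per_frame)      # ['tree', 'hiking', 'hiking']
print(global_labels)  # ['hiking']
```

In this toy case the first frame is labeled "tree", yet the video-level label is "hiking", which mirrors the example in the description. Mean pooling is only one possible aggregation; max pooling or a learned temporal model would be equally valid choices.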

Latest papers with no code

Deep Unsupervised Key Frame Extraction for Efficient Video Classification

no code yet • 12 Nov 2022

The proposed TSDPC is a generic and powerful framework with two advantages over previous work, one of which is that it can calculate the number of key frames automatically.

BOREx: Bayesian-Optimization-Based Refinement of Saliency Map for Image- and Video-Classification Models

no code yet • 31 Oct 2022

We propose a new black-box method BOREx (Bayesian Optimization for Refinement of visual model Explanation) to refine a heat map produced by any method.

Transfer-learning for video classification: Video Swin Transformer on multiple domains

no code yet • 18 Oct 2022

From the results, we conclude that VST generalizes well enough to classify out-of-domain videos without retraining when the target classes are of the same type as the classes used to train the model.

Linear Video Transformer with Feature Fixation

no code yet • 15 Oct 2022

Therefore, we propose a feature fixation module to reweight the feature importance of the query and key before computing linear attention.

FuTH-Net: Fusing Temporal Relations and Holistic Features for Aerial Video Classification

no code yet • 22 Sep 2022

Furthermore, the holistic features are refined by the multi-scale temporal relations in a novel fusion module for yielding more discriminative video representations.

Traffic Congestion Prediction using Deep Convolutional Neural Networks: A Color-coding Approach

no code yet • 16 Sep 2022

This work proposes a unique technique for traffic video classification that applies a color-coding scheme to the traffic data before training a deep convolutional neural network on it.

On the Surprising Effectiveness of Transformers in Low-Labeled Video Recognition

no code yet • 15 Sep 2022

Our work empirically explores the low data regime for video classification and discovers that, surprisingly, transformers perform extremely well in the low-labeled video setting compared to CNNs.

UAV-CROWD: Violent and non-violent crowd activity simulator from the perspective of UAV

no code yet • 13 Aug 2022

Unmanned Aerial Vehicles (UAVs) have gained significant traction in recent years, particularly in the context of surveillance.

Motion Sensitive Contrastive Learning for Self-supervised Video Representation

no code yet • 12 Aug 2022

Contrastive learning has shown great potential in video representation learning.

Two-Stream Transformer Architecture for Long Video Understanding

no code yet • 2 Aug 2022

Pure vision transformer architectures are highly effective for short video classification and action recognition tasks.