Video Classification

175 papers with code • 11 benchmarks • 17 datasets

Video Classification is the task of producing a label that is relevant to a video given its frames. A good video-level classifier not only provides accurate frame labels, but also best describes the entire video given the features and annotations of its individual frames. For example, a video might contain a tree in some frame, but the label that is central to the video might be something else (e.g., “hiking”). The granularity of the labels needed to describe the frames and the video depends on the task. Typical tasks include assigning one or more global labels to the video, and assigning one or more labels to each frame in the video.

Source: Efficient Large Scale Video Classification
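The two typical tasks described above can be illustrated with a minimal sketch. It assumes a model has already produced per-frame label scores; the function names (`frame_labels`, `video_labels`), the score values, and the use of mean pooling to aggregate frames are all illustrative assumptions, not the method of the cited source.

```python
from typing import Dict, List


def frame_labels(frame_scores: List[Dict[str, float]]) -> List[str]:
    """Per-frame task: pick the highest-scoring label for each frame."""
    return [max(scores, key=scores.get) for scores in frame_scores]


def video_labels(frame_scores: List[Dict[str, float]], top_k: int = 1) -> List[str]:
    """Global task: average each label's score over all frames (mean pooling,
    an assumed aggregation rule) and return the top_k labels."""
    totals: Dict[str, float] = {}
    for scores in frame_scores:
        for label, score in scores.items():
            totals[label] = totals.get(label, 0.0) + score
    n_frames = len(frame_scores)
    means = {label: total / n_frames for label, total in totals.items()}
    return sorted(means, key=means.get, reverse=True)[:top_k]


# Hypothetical scores for a three-frame clip: a tree dominates one frame,
# but "hiking" is the label central to the whole video.
scores = [
    {"hiking": 0.6, "tree": 0.3, "dog": 0.1},
    {"hiking": 0.4, "tree": 0.5, "dog": 0.1},
    {"hiking": 0.7, "tree": 0.2, "dog": 0.1},
]

print(frame_labels(scores))  # ['hiking', 'tree', 'hiking']
print(video_labels(scores))  # ['hiking']
```

Mean pooling is only one possible way to combine frames; it is used here because it makes the difference between a frame-level label ("tree") and the video-level label ("hiking") easy to see.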


Latest papers with no code

NetFlick: Adversarial Flickering Attacks on Deep Learning Based Video Compression

no code yet • 4 Apr 2023

Experimental results demonstrate that NetFlick can successfully deteriorate the performance of video compression frameworks in both digital and physical settings and can be further extended to attack downstream video classification networks.

Unified Keypoint-based Action Recognition Framework via Structured Keypoint Pooling

no code yet • CVPR 2023

A point cloud deep-learning paradigm is introduced to action recognition, and a unified framework along with a novel deep neural network architecture called Structured Keypoint Pooling is proposed.

Selective Structured State-Spaces for Long-Form Video Understanding

no code yet • CVPR 2023

To address this limitation, we present a novel Selective S4 (i.e., S5) model that employs a lightweight mask generator to adaptively select informative image tokens, resulting in more efficient and accurate modeling of long-term spatiotemporal dependencies in videos.

ViC-MAE: Self-Supervised Representation Learning from Images and Video with Contrastive Masked Autoencoders

no code yet • 21 Mar 2023

We show that visual representations learned under ViC-MAE generalize well to both video and image classification tasks.

Temporal Coherent Test-Time Optimization for Robust Video Classification

no code yet • 28 Feb 2023

To exploit information in video with self-supervised learning, TeCo uses global content from video clips and optimizes models for entropy minimization.

Video4MRI: An Empirical Study on Brain Magnetic Resonance Image Analytics with CNN-based Video Classification Frameworks

no code yet • 24 Feb 2023

Due to the high similarity between MRI data and videos, we conduct extensive empirical studies on video recognition techniques for MRI classification to answer the questions: (1) can we directly use video recognition models for MRI classification, (2) which model is more appropriate for MRI, (3) are the common tricks like data augmentation in video recognition still useful for MRI classification?

Analysis of Real-Time Hostile Activity Detection from Spatiotemporal Features Using Time Distributed Deep CNNs, RNNs and Attention-Based Mechanisms

no code yet • 21 Feb 2023

To eradicate this issue, intelligent surveillance systems can be built using deep learning video classification techniques that can help us automate surveillance systems to detect violence as it happens.

Few-Shot Video Classification via Representation Fusion and Promotion Learning

no code yet • ICCV 2023

This operation maximizes the contribution of discriminative frames to further capture the similarity of support and query samples from the same category.

Truncate-Split-Contrast: A Framework for Learning from Mislabeled Videos

no code yet • 27 Dec 2022

Thanks to Noise Contrastive Learning, the average classification accuracy improvement on Mini-Kinetics and Sth-Sth-V1 is over 1.6%.

VideoCoCa: Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners

no code yet • 9 Dec 2022

We explore an efficient approach to establish a foundational video-text model.