Action Classification

227 papers with code • 24 benchmarks • 30 datasets

InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding

opengvlab/internvideo 22 Mar 2024

We introduce InternVideo2, a new video foundation model (ViFM) that achieves state-of-the-art performance in action recognition, video-text tasks, and video-centric dialogue.

921

Open-Vocabulary Video Relation Extraction

iriya99/ovre 25 Dec 2023

A comprehensive understanding of videos is inseparable from describing the action with its contextual action-object interactions.

10

CAST: Cross-Attention in Space and Time for Video Action Recognition

khu-vll/cast NeurIPS 2023

In this work, we propose a novel two-stream architecture, called Cross-Attention in Space and Time (CAST), that achieves a balanced spatio-temporal understanding of videos using only RGB input.

27
30 Nov 2023

Just Add $π$! Pose Induced Video Transformers for Understanding Activities of Daily Living

dominickrei/pi-vit 30 Nov 2023

To facilitate the adoption of video transformers for ADL, we hypothesize that the augmentation of RGB with human pose information, known for its sensitivity to fine-grained motion and multiple viewpoints, is essential.

6

Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning

whwu95/ATM 27 Nov 2023

In this paper, we present Side4Video, a novel Spatial-Temporal Side Network for memory-efficient fine-tuning of large image models for video understanding.

65

ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video

leexinhao/ZeroI2V 2 Oct 2023

In this paper, our goal is to present a zero-cost adaptation paradigm (ZeroI2V) to transfer image transformers to video recognition tasks (i.e., introducing zero extra cost to the adapted models during inference).

10

MOFO: MOtion FOcused Self-Supervision for Video Understanding

moohnai/mofo 23 Aug 2023

Despite the importance of motion in supervised learning techniques for action recognition, SSL methods often do not explicitly consider motion information in videos.

7

Progression-Guided Temporal Action Detection in Videos

makecent/apn 18 Aug 2023

The framework locates actions in videos by detecting the action evolution process.

1

Temporally-Adaptive Models for Efficient Video Understanding

alibaba-mmai-research/TAdaConv 10 Aug 2023

Spatial convolutions are extensively used in numerous deep video models.

215

Actor-agnostic Multi-label Action Recognition with Multi-modal Query

mondalanindya/msqnet 20 Jul 2023

Existing action recognition methods are typically actor-specific due to the intrinsic topological and apparent differences among the actors.

16