Action Recognition In Videos

64 papers with code • 17 benchmarks • 17 datasets

Action Recognition in Videos is a task in computer vision and pattern recognition where the goal is to identify and categorize human actions performed in a video sequence. The task involves analyzing the spatiotemporal dynamics of the actions and mapping them to a predefined set of action classes, such as running, jumping, or swimming.
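As a minimal sketch of the "map to a predefined set of action classes" step, the snippet below averages per-frame class scores over time and picks the highest-scoring class. It assumes some model has already produced one score per class for each frame; the function name `classify_clip` and the example scores are hypothetical, and plain temporal averaging is only the simplest possible baseline, not how any of the listed papers model spatiotemporal dynamics.

```python
from typing import List, Sequence

# Class names taken from the examples in the task description.
ACTION_CLASSES = ["running", "jumping", "swimming"]


def classify_clip(frame_scores: Sequence[Sequence[float]],
                  classes: List[str] = ACTION_CLASSES) -> str:
    """Average per-frame class scores over time and return the top class."""
    if not frame_scores:
        raise ValueError("clip has no frames")
    num_frames = len(frame_scores)
    # Mean score per class across all frames of the clip.
    mean_scores = [
        sum(frame[c] for frame in frame_scores) / num_frames
        for c in range(len(classes))
    ]
    best = max(range(len(classes)), key=lambda c: mean_scores[c])
    return classes[best]


# Hypothetical scores for a 3-frame clip (made-up numbers, not model output).
example_scores = [
    [0.2, 0.7, 0.1],
    [0.3, 0.6, 0.1],
    [0.1, 0.8, 0.1],
]
predicted = classify_clip(example_scores)
```

Here `predicted` is the class whose averaged score is largest; real systems replace the averaging with a learned temporal model.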


MMNet: A Model-Based Multimodal Network for Human Action Recognition in RGB-D Videos

bruceyo/MMNet IEEE Transactions on Pattern Analysis and Machine Intelligence 2022

Upon aggregating the results of multiple modalities, our method is found to outperform state-of-the-art approaches on six evaluation protocols of the five datasets; thus, the proposed MMNet can effectively capture mutually complementary features in different RGB-D video modalities and provide more discriminative features for HAR.

25
26 May 2022

DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition

uark-cviu/direcformer CVPR 2022

Various 3D-CNN based methods have been presented to tackle both the spatial and temporal dimensions in the task of video action recognition with competitive results.

24
19 Mar 2022

Self-supervised Video Transformer

kahnchana/svt CVPR 2022

To the best of our knowledge, the proposed approach is the first to alleviate the dependency on negative samples or dedicated memory banks in Self-supervised Video Transformer (SVT).

93
02 Dec 2021

Florence: A New Foundation Model for Computer Vision

microsoft/unicl 22 Nov 2021

Computer vision foundation models, which are trained on diverse, large-scale datasets and can be adapted to a wide range of downstream tasks, are critical for this mission to solve real-world computer vision applications.

369
22 Nov 2021

Logsig-RNN: a novel network for robust and efficient skeleton-based action recognition

steveliao93/gcn_logsigrnn 25 Oct 2021

In this paper, we propose a novel module, namely Logsig-RNN, which is the combination of the log-signature layer and recurrent type neural networks (RNNs).

11
25 Oct 2021

ActionCLIP: A New Paradigm for Video Action Recognition

towhee-io/towhee 17 Sep 2021

Moreover, to handle the deficiency of label texts and make use of tremendous web data, we propose a new paradigm based on this multimodal learning framework for action recognition, which we dub "pre-train, prompt and fine-tune".

3,000
17 Sep 2021

Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting

martinetoering/ViCC 18 Jun 2021

Instance-level contrastive learning techniques, which rely on data augmentation and a contrastive loss function, have found great success in the domain of visual representation learning.

37
18 Jun 2021
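The "contrastive loss function" mentioned in the ViCC entry above can be illustrated with a generic InfoNCE-style loss. This is a sketch of the standard instance-level formulation only, not the cross-stream prototypical method of the paper; the function names, the temperature value, and the toy embeddings are assumptions made for illustration.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor: Sequence[float],
             positive: Sequence[float],
             negatives: Sequence[Sequence[float]],
             temperature: float = 0.1) -> float:
    """InfoNCE loss: -log softmax of the positive among all candidates."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]


# Toy 2-D embeddings (hypothetical): two augmented views of one clip agree,
# while the negative comes from a different clip.
good = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
bad = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

The loss is low when the anchor is closer to its positive than to the negatives (`good`) and high in the opposite case (`bad`).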

Space-time Mixing Attention for Video Transformer

1adrianb/video-transformers NeurIPS 2021

In this work, we propose a Video Transformer model whose complexity scales linearly with the number of frames in the video sequence and hence induces no overhead compared to an image-based Transformer model.

46
10 Jun 2021
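To make the linear-scaling claim in the Space-time Mixing Attention entry above concrete, the sketch below counts query-key pairs for two reference points: full joint space-time attention over all tokens, and an image Transformer applied to each frame independently. It does not reproduce the paper's attention mechanism; it only shows what "quadratic in the number of frames" versus "linear in the number of frames" means. The function names and the choice of 196 tokens per frame (a 14 x 14 patch grid) are assumptions.

```python
def joint_attention_pairs(num_frames: int, tokens_per_frame: int) -> int:
    """Query-key pairs when every token attends to every token in the clip."""
    total_tokens = num_frames * tokens_per_frame
    return total_tokens * total_tokens


def per_frame_attention_pairs(num_frames: int, tokens_per_frame: int) -> int:
    """Query-key pairs when attention is restricted to tokens of the same frame."""
    return num_frames * tokens_per_frame ** 2


# Assumed setting: 8 frames, 196 tokens per frame (14 x 14 patches).
frames, tokens = 8, 196
joint = joint_attention_pairs(frames, tokens)          # grows as T^2 * S^2
per_frame = per_frame_attention_pairs(frames, tokens)  # grows as T * S^2
ratio = joint // per_frame                             # equals the frame count
```

Under these assumptions the joint scheme needs `frames` times more pairs than the per-frame scheme, so the gap widens as clips get longer.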

Multimodal Fusion via Teacher-Student Network for Indoor Action Recognition

bruceyo/TSMF Association for the Advancement of Artificial Intelligence (AAAI) 2021

In our TSMF, we utilize a teacher network to transfer the structural knowledge of the skeleton modality to a student network for the RGB modality.

20
18 May 2021

Learning Implicit Temporal Alignment for Few-shot Video Classification

tonysy/PyAction 11 May 2021

Few-shot video classification aims to learn new video categories with only a few labeled examples, alleviating the burden of costly annotation in real-world applications.

16
11 May 2021