Action Recognition In Videos

64 papers with code • 17 benchmarks • 17 datasets

Action Recognition in Videos is a task in computer vision and pattern recognition where the goal is to identify and categorize human actions performed in a video sequence. The task involves analyzing the spatiotemporal dynamics of the actions and mapping them to a predefined set of action classes, such as running, jumping, or swimming.
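As a quick orientation, the sketch below classifies a short clip with an off-the-shelf pretrained 3D CNN from torchvision; the model, weights, and dummy input are illustrative choices and are not tied to any paper listed on this page.

```python
# Minimal sketch: classify a short clip with a pretrained 3D CNN (torchvision >= 0.13).
import torch
from torchvision.models.video import r3d_18, R3D_18_Weights

weights = R3D_18_Weights.KINETICS400_V1
model = r3d_18(weights=weights).eval()
preprocess = weights.transforms()

# A dummy clip: (T, C, H, W) = 16 RGB frames of 128x128; replace with real video frames.
clip = torch.rand(16, 3, 128, 128)
batch = preprocess(clip).unsqueeze(0)          # -> (1, C, T, H', W')

with torch.no_grad():
    logits = model(batch)
label = weights.meta["categories"][logits.argmax(dim=1).item()]
print(label)                                   # e.g. "running" for a real clip
```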

Libraries

Use these libraries to find Action Recognition In Videos models and implementations

Latest papers with no code

NAS-TC: Neural Architecture Search on Temporal Convolutions for Complex Action Recognition

no code yet • 17 Mar 2021

Owing to its automated design of network structures, neural architecture search (NAS) has achieved great success in the image-processing field and attracted substantial research attention in recent years.

Temporal Difference Networks for Action Recognition

no code yet • 1 Jan 2021

To mitigate this issue, this paper presents a new video architecture, termed the Temporal Difference Network (TDN), with a focus on capturing multi-scale temporal information for efficient action recognition.
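The core intuition behind temporal-difference models can be shown in a few lines: subtracting neighbouring frames yields a cheap short-term motion cue. This is a generic sketch of the frame-difference idea, not the TDN module itself.

```python
# Adjacent-frame differences as a lightweight motion representation.
import torch

def temporal_differences(frames: torch.Tensor) -> torch.Tensor:
    """frames: (T, C, H, W) -> (T-1, C, H, W) of adjacent-frame differences."""
    return frames[1:] - frames[:-1]

clip = torch.rand(8, 3, 224, 224)        # 8 RGB frames
diffs = temporal_differences(clip)       # 7 difference "frames"
print(diffs.shape)                       # torch.Size([7, 3, 224, 224])
```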

Developing Motion Code Embedding for Action Recognition in Videos

no code yet • 10 Dec 2020

In this work, we propose a motion embedding strategy known as motion codes, a vectorized representation of motions based on a manipulation's salient mechanical attributes.
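As a rough illustration of what a vectorized motion representation might look like, the toy sketch below one-hot encodes a few hand-picked mechanical attributes into a fixed-length code; the attribute names and values are hypothetical and are not the paper's actual encoding scheme.

```python
# Toy motion-code sketch: hypothetical mechanical attributes flattened into one vector.
ATTRIBUTES = {
    "contact":    ["none", "grasp", "push"],      # hypothetical attribute
    "trajectory": ["linear", "rotational"],       # hypothetical attribute
}

def encode_motion(attrs: dict) -> list:
    """One-hot encode each attribute and concatenate into a single motion code."""
    code = []
    for name, values in ATTRIBUTES.items():
        code += [1 if attrs[name] == v else 0 for v in values]
    return code

print(encode_motion({"contact": "grasp", "trajectory": "linear"}))  # [0, 1, 0, 1, 0]
```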

Toward Accurate Person-level Action Recognition in Videos of Crowded Scenes

no code yet • 16 Oct 2020

Prior works typically fail to handle this problem in two respects: (1) they do not exploit information about the scene; (2) they lack training data for crowded and complex scenes.

Dynamic Sampling Networks for Efficient Action Recognition in Videos

no code yet • 28 Jun 2020

Existing action recognition methods are mainly based on clip-level classifiers such as two-stream CNNs or 3D CNNs, which are trained on randomly selected clips and applied to densely sampled clips during testing.
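The two sampling regimes mentioned in the excerpt can be sketched with plain index helpers, assuming a fixed clip length and stride; these functions are for illustration only and are not taken from the paper.

```python
# Random clip sampling (training) vs. dense clip sampling (testing).
import random

def random_clip(num_frames: int, clip_len: int) -> list:
    """One random contiguous clip of `clip_len` frame indices (training)."""
    start = random.randint(0, num_frames - clip_len)
    return list(range(start, start + clip_len))

def dense_clips(num_frames: int, clip_len: int, stride: int) -> list:
    """All contiguous clips at a fixed stride covering the video (testing)."""
    starts = range(0, num_frames - clip_len + 1, stride)
    return [list(range(s, s + clip_len)) for s in starts]

print(random_clip(100, 16))          # e.g. frames 37..52
print(len(dense_clips(100, 16, 8)))  # 11 clips whose scores are averaged at test time
```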

Spatiotemporal Fusion in 3D CNNs: A Probabilistic View

no code yet • CVPR 2020

Based on the probability space, we further generate new fusion strategies which achieve state-of-the-art performance on four well-known action recognition datasets.

TEA: Temporal Excitation and Aggregation for Action Recognition

no code yet • CVPR 2020

Temporal modeling is key for action recognition in videos.

Dynamic Inference: A New Approach Toward Efficient Video Action Recognition

no code yet • 9 Feb 2020

In a nutshell, we treat the input frames and the network depth of the computational graph as a two-dimensional grid, on which several checkpoints with prediction modules are placed in advance.
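A toy sketch of the early-exit idea follows: prediction heads (checkpoints) are evaluated as frames are consumed, and inference stops as soon as one of them is confident enough. The tiny backbone, shared head, and threshold below are placeholders, not the paper's architecture.

```python
# Early-exit sketch: stop processing frames once a checkpoint is confident.
import torch
import torch.nn as nn

class EarlyExitClassifier(nn.Module):
    def __init__(self, num_classes=10, feat_dim=64, threshold=0.9):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.LazyLinear(feat_dim), nn.ReLU())
        self.head = nn.Linear(feat_dim, num_classes)   # shared prediction head (checkpoint)
        self.threshold = threshold

    def forward(self, frames):                         # frames: (T, C, H, W)
        feat_sum = 0.0
        for t, frame in enumerate(frames, start=1):
            feat_sum = feat_sum + self.backbone(frame.unsqueeze(0))
            probs = self.head(feat_sum / t).softmax(dim=1)
            if probs.max() > self.threshold:           # confident checkpoint: exit early
                return probs, t
        return probs, len(frames)

probs, frames_used = EarlyExitClassifier()(torch.rand(16, 3, 32, 32))
print(frames_used)                                     # frames consumed before the exit
```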

An Information-rich Sampling Technique over Spatio-Temporal CNN for Classification of Human Actions in Videos

no code yet • 6 Feb 2020

Traditionally, in deep-learning-based human activity recognition approaches, either a few random frames or every $k^{th}$ frame of the video is considered for training the 3D CNN, where $k$ is a small positive integer, like 4, 5, or 6.
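The two conventional schemes described in the excerpt, a few random frames versus every $k^{th}$ frame, amount to simple index selection; the helpers below are illustrative only.

```python
# Uniform (every k-th frame) vs. random frame sampling for 3D CNN training.
import random

def every_kth_frame(num_frames: int, k: int = 5) -> list:
    """Uniform temporal subsampling: frames 0, k, 2k, ..."""
    return list(range(0, num_frames, k))

def random_frames(num_frames: int, n: int = 16) -> list:
    """n frame indices drawn at random (sorted to preserve temporal order)."""
    return sorted(random.sample(range(num_frames), n))

print(every_kth_frame(60, k=5))   # [0, 5, 10, ..., 55]
print(random_frames(60, n=8))
```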

Skeleton based Activity Recognition by Fusing Part-wise Spatio-temporal and Attention Driven Residues

no code yet • 2 Dec 2019

The same action exhibits a wide range of intra-class variation while different actions show inter-class similarity, which makes action recognition in videos very challenging.