Temporal Action Localization
422 papers with code • 14 benchmarks • 42 datasets
Temporal Action Localization aims to detect activity instances in an untrimmed video stream and output the start and end timestamps of each instance. It is closely related to Temporal Action Proposal Generation.
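Since predictions are temporal segments, localization quality is typically scored by temporal intersection-over-union (tIoU) between a predicted segment and a ground-truth segment. A minimal sketch (the function name and the example timestamps are illustrative, not from any specific benchmark toolkit):

```python
def temporal_iou(pred, gt):
    """Temporal IoU between two (start, end) segments, in seconds.

    Returns a value in [0, 1]; 0 when the segments do not overlap.
    """
    inter_start = max(pred[0], gt[0])
    inter_end = min(pred[1], gt[1])
    intersection = max(0.0, inter_end - inter_start)
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - intersection
    return intersection / union if union > 0 else 0.0

# A prediction covering 2s-6s against a ground-truth action at 4s-8s:
score = temporal_iou((2.0, 6.0), (4.0, 8.0))  # 2s overlap / 6s union ≈ 0.33
```

Benchmarks usually report mean Average Precision at several tIoU thresholds (e.g. 0.5 to 0.95), counting a detection as correct when its tIoU with an unmatched ground-truth segment exceeds the threshold.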
Libraries
Use these libraries to find Temporal Action Localization models and implementations
Datasets
Subtasks
Most implemented papers
Representation Flow for Action Recognition
Our representation flow layer is a fully-differentiable layer designed to capture the 'flow' of any representation channel within a convolutional neural network for action recognition.
Explaining NonLinear Classification Decisions with Deep Taylor Decomposition
Although our focus is on image classification, the method is applicable to a broad set of input data, learning tasks and network architectures.
TS-LSTM and Temporal-Inception: Exploiting Spatiotemporal Dynamics for Activity Recognition
We demonstrate that both RNNs (using LSTMs) and Temporal-ConvNets, applied to spatiotemporal feature matrices, are able to exploit spatiotemporal dynamics to improve overall performance.
Im2Flow: Motion Hallucination from Static Images for Action Recognition
Second, we show the power of hallucinated flow for recognition, successfully transferring the learned motion into a standard two-stream network for activity recognition.
Moments in Time Dataset: one million videos for event understanding
We present the Moments in Time Dataset, a large-scale human-annotated collection of one million short videos corresponding to dynamic events unfolding within three seconds.
Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition
In addition, the second-order information (the lengths and directions of bones) of the skeleton data, which is naturally more informative and discriminative for action recognition, is rarely investigated in existing methods.
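The second-order (bone) information mentioned above can be derived directly from the joint coordinates: each bone is the vector from a joint's parent to the joint itself, which carries length and direction. A minimal sketch, assuming a hypothetical 5-joint chain skeleton (the parent table and array shapes are illustrative, not the paper's graph definition):

```python
import numpy as np

# Hypothetical parent index per joint; the root (joint 0) points to itself.
PARENTS = [0, 0, 1, 2, 3]

def bone_features(joints):
    """Second-order skeleton features: vector from each joint's parent to the joint.

    joints: (J, 3) array of 3D joint coordinates for one frame.
    Returns a (J, 3) array; the root's bone is the zero vector.
    """
    joints = np.asarray(joints, dtype=float)
    return joints - joints[PARENTS]

frame = np.array([[0, 0, 0],   # root
                  [0, 1, 0],
                  [0, 2, 0],
                  [1, 2, 0],
                  [2, 2, 0]], dtype=float)
bones = bone_features(frame)  # e.g. bones[3] = joint 3 - joint 2 = [1, 0, 0]
```

In a two-stream setup, one stream consumes the joint coordinates and the other consumes these bone vectors, with their scores fused at the end.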
What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment
Can performance on the task of action quality assessment (AQA) be improved by exploiting a description of the action and its quality?
Action Transformer: A Self-Attention Model for Short-Time Pose-Based Human Action Recognition
Deep neural networks based purely on attention have been successful across several domains, relying on minimal architectural priors from the designer.
Action Recognition with Dynamic Image Networks
This is a powerful idea because it makes it possible to convert any video into a single image, so that existing CNN models pre-trained on still images can be applied directly to videos.
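The video-to-image idea can be illustrated with a linearly time-weighted sum of frames, where later frames receive larger weights so the result encodes temporal evolution. This is a simplified stand-in for the paper's rank-pooling construction, not its exact coefficients:

```python
import numpy as np

def dynamic_image(frames):
    """Collapse a (T, H, W, C) video into one (H, W, C) image via a
    linearly time-weighted sum (simplified rank-pooling-style weights)."""
    frames = np.asarray(frames, dtype=float)
    T = frames.shape[0]
    # Weights -T+1, -T+3, ..., T-1: symmetric around zero, increasing in time.
    alphas = 2.0 * np.arange(1, T + 1) - T - 1
    return np.tensordot(alphas, frames, axes=1)

video = np.random.rand(8, 4, 4, 3)
img = dynamic_image(video)  # shape (4, 4, 3), feedable to an image CNN
```

Because the weights sum to zero, a static video collapses to an all-zero image: only the motion between frames survives the pooling.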
Hidden Two-Stream Convolutional Networks for Action Recognition
State-of-the-art action recognition approaches rely on traditional optical flow estimation methods to pre-compute motion information for CNNs.