Action Detection
233 papers with code • 11 benchmarks • 33 datasets
Action Detection aims to find both where and when an action occurs within a video clip and to classify which action is taking place. Results are typically given as action tubelets, which are action bounding boxes linked across time in the video. This is related to temporal localization, which seeks to identify the start and end frame of an action, and to action recognition, which seeks only to classify which action is taking place and typically assumes a trimmed video.
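To make the idea of a tubelet concrete, here is a minimal sketch of one possible representation, together with a simple greedy linker that chains per-frame boxes by overlap. The names `Tubelet`, `iou`, and `link_detections`, the `(x1, y1, x2, y2)` box format, and the 0.5 overlap threshold are all illustrative assumptions, not the method of any particular paper listed here; published detectors generally use more elaborate linking.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixel coordinates.
Box = Tuple[float, float, float, float]


@dataclass
class Tubelet:
    """Hypothetical container: one action label plus one box per frame."""
    label: str
    boxes: Dict[int, Box] = field(default_factory=dict)  # frame index -> box

    @property
    def start(self) -> int:
        # First frame of the action (the "when" part of the output).
        return min(self.boxes)

    @property
    def end(self) -> int:
        # Last frame of the action.
        return max(self.boxes)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_detections(
    frames: List[List[Tuple[str, Box]]], min_iou: float = 0.5
) -> List[Tubelet]:
    """Greedily link per-frame (label, box) detections into tubelets.

    A detection extends the best-overlapping tubelet of the same label
    that ended on the previous frame; otherwise it starts a new tubelet.
    """
    tubelets: List[Tubelet] = []
    for t, detections in enumerate(frames):
        for label, box in detections:
            best, best_iou = None, min_iou
            for tube in tubelets:
                if tube.label == label and tube.end == t - 1:
                    overlap = iou(tube.boxes[t - 1], box)
                    if overlap >= best_iou:
                        best, best_iou = tube, overlap
            if best is None:
                best = Tubelet(label)
                tubelets.append(best)
            best.boxes[t] = box
    return tubelets


if __name__ == "__main__":
    # Toy input: a "run" action on frames 0-1, nothing on frame 2, then again on frame 3.
    frames = [
        [("run", (0, 0, 10, 10))],
        [("run", (1, 1, 11, 11))],
        [],
        [("run", (1, 1, 11, 11))],
    ]
    for tube in link_detections(frames):
        print(tube.label, tube.start, tube.end)
```

In this representation, the `start` and `end` frames of a tubelet correspond to what temporal localization predicts, and the `label` alone corresponds to the output of action recognition.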
Latest papers
Long-term Conversation Analysis: Exploring Utility and Privacy
The analysis of conversations recorded in everyday life requires privacy protection.
E2E-LOAD: End-to-End Long-form Online Action Detection
Furthermore, we propose a novel and efficient inference mechanism that accelerates heavy spatial-temporal exploration.
ShuttleSet: A Human-Annotated Stroke-Level Singles Dataset for Badminton Tactical Analysis
With the recent progress in sports analytics, deep learning approaches have demonstrated the effectiveness of mining insights into players' tactics for improving performance quality and fan engagement.
FunASR: A Fundamental End-to-End Speech Recognition Toolkit
FunASR offers models trained on large-scale industrial corpora and the ability to deploy them in applications.
Efficient Video Action Detection with Token Dropout and Context Refinement
Our EVAD consists of two specialized designs for video action detection.
WEAR: An Outdoor Sports Dataset for Wearable and Egocentric Activity Recognition
Though research has shown the complementarity of camera- and inertial-based data, datasets which offer both egocentric video and inertial-based sensor data remain scarce.
Interaction-Aware Prompting for Zero-Shot Spatio-Temporal Action Detection
Finally, we calculate the similarity between the interaction feature and the text feature for each label to determine the action category.
Boundary-Denoising for Video Activity Localization
To alleviate the boundary ambiguity, we propose to study the video activity localization problem from a denoising perspective.
Evaluation of Noise Reduction Methods for Sentence Recognition by Sinhala Speaking Listeners
Noise reduction is a crucial aspect of hearing aids, which researchers have been striving to address over the years.
DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion
Concretely, we establish the denoising process in the Transformer decoder (e.g., DETR) by introducing a temporal location query design with faster convergence in training.