Action Localization

131 papers with code • 0 benchmarks • 3 datasets

Action Localization is the task of finding the spatial and temporal coordinates of an action in a video. An action localization model identifies the frames in which an action starts and ends and returns the x, y coordinates of the action. These coordinates change as the object performing the action moves.
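As a rough illustration of this definition, the output of a spatio-temporal localizer can be pictured as a start frame, an end frame, and one bounding box per frame in between. The sketch below is only one possible representation: the names `ActionTube`, `Box`, and `center_track`, and the (x_min, y_min, x_max, y_max) pixel convention, are assumptions made for this example and do not come from any library listed on this page.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Assumed box convention for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class ActionTube:
    """Hypothetical container for one localized action."""
    label: str
    start_frame: int          # first frame in which the action is visible
    end_frame: int            # last frame in which the action is visible (inclusive)
    boxes: Dict[int, Box]     # frame index -> bounding box in that frame

    def center_track(self) -> List[Tuple[int, float, float]]:
        """Return (frame, x_center, y_center) for each annotated frame, in frame order."""
        track = []
        for frame in sorted(self.boxes):
            x0, y0, x1, y1 = self.boxes[frame]
            track.append((frame, (x0 + x1) / 2, (y0 + y1) / 2))
        return track


# The actor moves to the right, so the x coordinate shifts from frame to frame.
tube = ActionTube(
    label="jump",
    start_frame=10,
    end_frame=12,
    boxes={10: (0, 0, 10, 20), 11: (2, 0, 12, 20), 12: (4, 0, 14, 20)},
)
```

Calling `tube.center_track()` gives the per-frame centre of the action, which makes the displacement of the actor over time explicit.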

Revisiting Foreground and Background Separation in Weakly-supervised Temporal Action Localization: A Clustering-based Approach

qinying-liu/case ICCV 2023

It comprises two core components: a snippet clustering component that groups the snippets into multiple latent clusters and a cluster classification component that further classifies the cluster as foreground or background.

98
21 Dec 2023

Unsupervised Temporal Action Localization via Self-paced Incremental Learning

tanghaoyu258/feel 12 Dec 2023

Thereafter, we design two (constant- and variable-speed) incremental instance learning strategies for easy-to-hard model training, thus ensuring the reliability of these video pseudolabels and further improving overall localization performance.

3
12 Dec 2023

GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation

zzxslp/mm-navigator 13 Nov 2023

We first benchmark MM-Navigator on our collected iOS screen dataset.

102
13 Nov 2023

Temporal Action Localization with Enhanced Instant Discriminability

sssste/tridet 11 Sep 2023

Temporal action detection (TAD) aims to detect all action boundaries and their corresponding categories in an untrimmed video.

143
11 Sep 2023
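Detected action boundaries are commonly compared with ground-truth segments using temporal intersection-over-union (tIoU). The function below is a minimal sketch of that measure, assuming segments are given as (start, end) pairs in seconds; the name `temporal_iou` is chosen for this example and is not taken from the repository above.

```python
def temporal_iou(a, b):
    """Intersection-over-union of two (start, end) intervals on the time axis."""
    # Overlap is zero when the intervals do not intersect.
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    # Guard against two zero-length intervals.
    return inter / union if union > 0 else 0.0
```

For example, a prediction covering 0 s to 10 s and a ground-truth action covering 5 s to 15 s overlap for 5 s out of a 15 s union, so their tIoU is one third.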

HR-Pro: Point-supervised Temporal Action Localization via Hierarchical Reliability Propagation

pipixin321/hr-pro 24 Aug 2023

For snippet-level learning, we introduce an online-updated memory to store reliable snippet prototypes for each class.

18
24 Aug 2023

DDG-Net: Discriminability-Driven Graph Network for Weakly-supervised Temporal Action Localization

xiaojuntang22/iccv2023-ddgnet ICCV 2023

Considering this phenomenon, we propose Discriminability-Driven Graph Network (DDG-Net), which explicitly models ambiguous snippets and discriminative snippets with well-designed connections, preventing the transmission of ambiguous information and enhancing the discriminability of snippet-level representations.

10
31 Jul 2023

NMS Threshold matters for Ego4D Moment Queries -- 2nd place solution to the Ego4D Moment Queries Challenge 2023

happyharrycn/actionformer_release 5 Jul 2023

This report describes our submission to the Ego4D Moment Queries Challenge 2023.

374
05 Jul 2023
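To show what the NMS threshold in the title controls, here is a minimal sketch of plain greedy non-maximum suppression over temporal segments. It is not necessarily the variant used in the submission, and the function name `temporal_nms` and the (start, end, score) tuple layout are assumptions made for this example.

```python
def temporal_nms(segments, iou_threshold):
    """Greedy NMS over (start, end, score) segments; returns kept segments, best first."""

    def iou(a, b):
        # Temporal IoU of two segments, ignoring their scores.
        inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
        union = (a[1] - a[0]) + (b[1] - b[0]) - inter
        return inter / union if union > 0 else 0.0

    kept = []
    # Visit candidates from highest to lowest confidence.
    for seg in sorted(segments, key=lambda s: s[2], reverse=True):
        # Keep a segment only if it does not overlap a kept one by more than the threshold.
        if all(iou(seg, k) <= iou_threshold for k in kept):
            kept.append(seg)
    return kept


candidates = [(0, 10, 0.9), (1, 11, 0.8), (20, 30, 0.7)]
strict = temporal_nms(candidates, iou_threshold=0.5)
loose = temporal_nms(candidates, iou_threshold=0.9)
```

The first two candidates overlap with a tIoU of 9/11, so the lower-scoring one is suppressed at a threshold of 0.5 but survives at 0.9. A higher threshold therefore keeps more overlapping detections.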

Proposal-Based Multiple Instance Learning for Weakly-Supervised Temporal Action Localization

RenHuan1999/CVPR2023_P-MIL CVPR 2023

Weakly-supervised temporal action localization aims to localize and recognize actions in untrimmed videos with only video-level category labels during training.

25
29 May 2023
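When only video-level labels are available, a common baseline is multiple instance learning with top-k pooling: per-snippet scores for a class are aggregated into one video-level score that can be trained against the video label. The sketch below illustrates that generic aggregation step only. It is not the proposal-based method of this paper, and `topk_video_score` is a hypothetical name.

```python
def topk_video_score(snippet_scores, k):
    """Average the k highest snippet scores of one class to get a video-level score."""
    if not snippet_scores:
        raise ValueError("snippet_scores must not be empty")
    # Clamp k to the valid range [1, number of snippets].
    k = max(1, min(k, len(snippet_scores)))
    top = sorted(snippet_scores, reverse=True)[:k]
    return sum(top) / k


# Four snippets; the two most confident ones dominate the video-level score.
video_score = topk_video_score([0.1, 0.9, 0.8, 0.2], k=2)
```

With k equal to the number of snippets this reduces to mean pooling, and with k = 1 it reduces to max pooling.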

Boosting Weakly-Supervised Temporal Action Localization with Text Information

lgzlilili/boosting-wtal CVPR 2023

For the discriminative objective, we propose a Text-Segment Mining (TSM) mechanism, which constructs a text description based on the action class label, and regards the text as the query to mine all class-related segments.

33
01 May 2023

Weakly-Supervised Temporal Action Localization with Bidirectional Semantic Consistency Constraint

lgzlilili/biscc 25 Apr 2023

The proposed Bi-SCC first adopts a temporal context augmentation to generate an augmented video that breaks the correlation between positive actions and their co-scene actions in the inter-video; then, a semantic consistency constraint (SCC) is used to enforce the predictions of the original video and augmented video to be consistent, hence suppressing the co-scene actions.

7
25 Apr 2023