Action Detection

233 papers with code • 11 benchmarks • 33 datasets

Action Detection aims to find both where and when an action occurs within a video clip and to classify which action is taking place. Results are typically given as action tubelets, which are action bounding boxes linked across time in the video. The task is related to temporal localization, which seeks to identify the start and end frames of an action, and to action recognition, which seeks only to classify which action is taking place and typically assumes a trimmed video.
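To make the tubelet idea concrete, here is a minimal Python sketch of how a tubelet could be represented and compared against a ground-truth annotation. The names `Tubelet`, `box_iou`, `temporal_iou` and `spatiotemporal_iou` are hypothetical and do not come from any library or paper listed on this page. The spatio-temporal overlap shown is one common definition (spatial IoU summed over shared frames, divided by the number of frames covered by either tubelet); individual benchmarks may define it differently.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# A box is (x1, y1, x2, y2) in pixel coordinates (assumed convention).
Box = Tuple[float, float, float, float]


@dataclass
class Tubelet:
    """An action label plus one bounding box per frame in which the action occurs."""
    label: str
    boxes: Dict[int, Box]  # frame index -> box

    @property
    def start(self) -> int:
        return min(self.boxes)

    @property
    def end(self) -> int:
        return max(self.boxes)


def box_iou(a: Box, b: Box) -> float:
    """Spatial intersection-over-union of two boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def temporal_iou(p: Tubelet, g: Tubelet) -> float:
    """Overlap of the [start, end] frame intervals only (temporal localization)."""
    inter = max(0, min(p.end, g.end) - max(p.start, g.start) + 1)
    union = (p.end - p.start + 1) + (g.end - g.start + 1) - inter
    return inter / union


def spatiotemporal_iou(p: Tubelet, g: Tubelet) -> float:
    """Spatial IoU summed over shared frames, normalised by all covered frames."""
    shared = p.boxes.keys() & g.boxes.keys()
    covered = p.boxes.keys() | g.boxes.keys()
    if not shared:
        return 0.0
    return sum(box_iou(p.boxes[f], g.boxes[f]) for f in shared) / len(covered)


if __name__ == "__main__":
    box = (0.0, 0.0, 10.0, 10.0)
    pred = Tubelet("run", {0: box, 1: box, 2: box})
    truth = Tubelet("run", {1: box, 2: box, 3: box})
    print(temporal_iou(pred, truth), spatiotemporal_iou(pred, truth))
```

In this sketch, action recognition corresponds to predicting only `label`, temporal localization to predicting `start` and `end`, and action detection to predicting the full per-frame `boxes` as well.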

TIM: A Time Interval Machine for Audio-Visual Action Recognition

faceonlive/ai-research 8 Apr 2024

We address the interplay between the two modalities in long videos by explicitly modelling the temporal extents of audio and visual events.

152

UniMD: Towards Unifying Moment Retrieval and Temporal Action Detection

yingsen1/unimd 7 Apr 2024

Temporal Action Detection (TAD) focuses on detecting pre-defined actions, while Moment Retrieval (MR) aims to identify the events described by open-ended natural language within untrimmed videos.

7

Online speaker diarization of meetings guided by speech separation

egruttadauria98/sspavaldo 30 Jan 2024

The results show that our system improves the state-of-the-art on the AMI headset mix, using no oracle information and under full evaluation (no collar and including overlapped speech).

16

Glance and Focus: Memory Prompting for Multi-Event Video Question Answering

byz0e/glance-focus NeurIPS 2023

Instead of that, we train an Encoder-Decoder to generate a set of dynamic event memories at the glancing stage.

17
03 Jan 2024

Generative Model-based Feature Knowledge Distillation for Action Recognition

aaai-24/generative-based-kd 14 Dec 2023

Addressing this gap, our paper introduces an innovative knowledge distillation framework, with the generative model for training a lightweight student model.

8

Advanced Image Segmentation Techniques for Neural Activity Detection via C-fos Immediate Early Gene Expression

dystopians/cfoscraft 13 Dec 2023

This research contributes to the development of more efficient and automated image segmentation methods, advancing the understanding of neural function in neuroscience research.

0

Semi-supervised Active Learning for Video Action Detection

akash2907/semi-sup-active-learning 12 Dec 2023

First, we demonstrate its effectiveness on video action detection where the proposed approach outperforms prior works in semi-supervised and weakly-supervised learning along with several baseline approaches in both UCF101-24 and JHMDB-21.

0

End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames

sming256/OpenTAD 28 Nov 2023

In this paper, we reduce the memory consumption for end-to-end training, and manage to scale up the TAD backbone to 1 billion parameters and the input video to 1,536 frames, leading to significant detection performance.

64

Centre Stage: Centricity-based Audio-Visual Temporal Action Detection

hanielwang/audio-visual-tad 28 Nov 2023

Previous one-stage action detection approaches have modelled temporal dependencies using only the visual modality.

2

ChimpACT: A Longitudinal Dataset for Understanding Chimpanzee Behaviors

shirleymaxx/chimpact NeurIPS 2023

ChimpACT is both comprehensive and challenging, consisting of 163 videos with a cumulative 160,500 frames, each richly annotated with detection, identification, pose estimation, and fine-grained spatiotemporal behavior labels.

10
25 Oct 2023