TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Temporal Action Localization	THUMOS’14	ASL (I3D features)	mAP IoU@0.3	83.1	#6
Temporal Action Localization	THUMOS’14	ASL (I3D features)	mAP IoU@0.4	79.0	#6
Temporal Action Localization	THUMOS’14	ASL (I3D features)	mAP IoU@0.5	71.7	#7
Temporal Action Localization	THUMOS’14	ASL (I3D features)	mAP IoU@0.6	59.7	#7
Temporal Action Localization	THUMOS’14	ASL (I3D features)	mAP IoU@0.7	45.8	#6
Temporal Action Localization	THUMOS’14	ASL (I3D features)	Avg mAP (0.3:0.7)	67.9	#9
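
The metrics in this table rest on two simple computations: the temporal IoU between a predicted segment and a ground-truth segment, and the average of the mAP values over the IoU thresholds 0.3 to 0.7. The sketch below illustrates both. The function name `temporal_iou` and the example segments are hypothetical, and the full evaluation protocol (matching predictions to ground truth and computing per-class AP) is not shown.

```python
def temporal_iou(pred, gt):
    """IoU of two 1-D segments, each given as (start, end) in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


# mAP values (in %) for ASL (I3D features) on THUMOS'14, copied from the table.
map_at_iou = {0.3: 83.1, 0.4: 79.0, 0.5: 71.7, 0.6: 59.7, 0.7: 45.8}

# Avg mAP (0.3:0.7) is the plain mean over the five thresholds.
avg_map = sum(map_at_iou.values()) / len(map_at_iou)

# Hypothetical segments, for illustration only: 3 s overlap, 5 s union.
iou = temporal_iou((2.0, 6.0), (3.0, 7.0))

print(f"Avg mAP (0.3:0.7): {avg_map:.1f}")
print(f"Example temporal IoU: {iou:.2f}")
```

The mean of the five per-threshold values rounds to the 67.9 reported in the last row, so the table is internally consistent.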

Action Sensitivity Learning for Temporal Action Localization

ICCV 2023 · Jiayi Shao, Xiaohan Wang, Ruijie Quan, Junjun Zheng, Jiang Yang, Yi Yang

Temporal action localization (TAL), which involves recognizing and locating action instances, is a challenging task in video understanding. Most existing approaches directly predict action classes and regress offsets to boundaries, while overlooking the differing importance of each frame. In this paper, we propose an Action Sensitivity Learning framework (ASL) to tackle this task, which aims to assess the value of each frame and then leverage the generated action sensitivity to recalibrate the training procedure. We first introduce a lightweight Action Sensitivity Evaluator to learn the action sensitivity at the class level and instance level, respectively. The outputs of the two branches are combined to reweight the gradient of the two sub-tasks (classification and boundary regression). Moreover, based on the action sensitivity of each frame, we design an Action Sensitive Contrastive Loss to enhance features, where the action-aware frames are sampled as positive pairs to push away the action-irrelevant frames. Extensive studies on various action localization benchmarks (i.e., MultiTHUMOS, Charades, Ego4D-Moment Queries v1.0, EPIC-KITCHENS-100, THUMOS14 and ActivityNet-1.3) show that ASL surpasses the state-of-the-art in terms of average mAP under multiple types of scenarios, e.g., single-labeled, densely-labeled and egocentric.
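
The reweighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the class-level and instance-level outputs are combined or how the weights enter the loss, so everything here is an assumption: the element-wise sum in `combine_sensitivity`, the normalized weighted mean in `weighted_loss`, and all numeric values are hypothetical and are not taken from the paper. Weighting a per-frame loss scales that frame's gradient contribution by the same factor, which is one simple way to reweight gradients.

```python
def combine_sensitivity(class_level, instance_level):
    """Assumed combination rule: element-wise sum of the two branches."""
    return [c + i for c, i in zip(class_level, instance_level)]


def weighted_loss(per_frame_losses, sensitivity):
    """Sensitivity-weighted mean of per-frame losses.

    Frames with higher sensitivity contribute more to the loss, and
    therefore to the gradient, than frames with lower sensitivity.
    """
    total = sum(sensitivity)
    if total == 0:
        return 0.0
    return sum(w * l for w, l in zip(sensitivity, per_frame_losses)) / total


# Hypothetical values for five frames inside one action instance.
class_level = [0.1, 0.4, 0.9, 0.4, 0.1]     # assumed class-level branch output
instance_level = [0.0, 0.2, 0.8, 0.3, 0.0]  # assumed instance-level branch output
cls_losses = [1.0, 0.5, 0.2, 0.5, 1.0]      # assumed per-frame classification losses

sensitivity = combine_sensitivity(class_level, instance_level)
loss = weighted_loss(cls_losses, sensitivity)
uniform = weighted_loss(cls_losses, [1.0] * len(cls_losses))

print(f"sensitivity-weighted loss: {loss:.4f}")
print(f"uniformly weighted loss:   {uniform:.4f}")
```

In this toy example the central frames carry most of the weight, so their small losses dominate and the weighted loss falls below the uniform average. The same weights could be applied to a per-frame regression loss for the boundary sub-task.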

Code

No code implementations yet.

Tasks

Action Localization

Moment Queries

Temporal Action Localization

Video Understanding

Datasets

ActivityNet

Charades

THUMOS14

EPIC-KITCHENS-100

MultiTHUMOS

Results from the Paper

Ranked #9 on Temporal Action Localization on THUMOS’14

Methods

No methods listed for this paper.
