TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Weakly Supervised Action Localization	ActivityNet-1.2	CASE	mAP@0.5	43.8	# 3
Weakly Supervised Action Localization	ActivityNet-1.2	CASE	Mean mAP	27.9	# 1
Weakly Supervised Action Localization	ActivityNet-1.3	CASE	mAP@0.5	43.2	# 2
Weakly Supervised Action Localization	ActivityNet-1.3	CASE	mAP@0.5:0.95	26.8	# 2
Weakly Supervised Action Localization	THUMOS 2014	CASE	mAP@0.1:0.7	46.2	# 6
Weakly Supervised Action Localization	THUMOS 2014	CASE	mAP@0.1:0.5	57.1	# 6
Weakly Supervised Action Localization	THUMOS 2014	CASE + Zhou et al.	mAP@0.1:0.7	49.2	# 4
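The metric names in the table refer to mean average precision at temporal IoU (tIoU) thresholds. A range such as mAP@0.5:0.95 conventionally means the mAP averaged over a grid of thresholds. The sketch below shows the tIoU computation and the threshold grids, assuming the usual benchmark steps (0.05 for 0.5:0.95, 0.1 for 0.1:0.7 and 0.1:0.5); the table itself does not state the step sizes, and the helper names `temporal_iou` and `threshold_grid` are illustrative only.

```python
import numpy as np


def temporal_iou(pred, gt):
    """tIoU of two temporal segments, each given as (start, end) in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def threshold_grid(low, high, step):
    """Inclusive grid of tIoU thresholds, e.g. 0.5, 0.55, ..., 0.95."""
    n = int(round((high - low) / step)) + 1
    return np.round(low + step * np.arange(n), 2)


# Assumed step sizes (benchmark convention, not stated in the table).
activitynet_thresholds = threshold_grid(0.5, 0.95, 0.05)
thumos_thresholds = threshold_grid(0.1, 0.7, 0.1)

# A prediction counts as a true positive at threshold t when its tIoU
# with an unmatched ground-truth instance of the same class is >= t.
overlap = temporal_iou((0.0, 10.0), (5.0, 15.0))
hits = overlap >= thumos_thresholds
```

A single-threshold entry such as mAP@0.5 uses only one value from the grid, while a range entry reports the mean of the per-threshold mAP values.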

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/revisiting-foreground-and-background-1/weakly-supervised-action-localization-on-2)](https://paperswithcode.com/sota/weakly-supervised-action-localization-on-2?p=revisiting-foreground-and-background-1)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/revisiting-foreground-and-background-1/weakly-supervised-action-localization-on-1)](https://paperswithcode.com/sota/weakly-supervised-action-localization-on-1?p=revisiting-foreground-and-background-1)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/revisiting-foreground-and-background-1/weakly-supervised-action-localization-on)](https://paperswithcode.com/sota/weakly-supervised-action-localization-on?p=revisiting-foreground-and-background-1)`

Revisiting Foreground and Background Separation in Weakly-supervised Temporal Action Localization: A Clustering-based Approach

ICCV 2023 · Qinying Liu, Zilei Wang, Shenghai Rong, Junjie Li, Yixin Zhang

Weakly-supervised temporal action localization aims to localize action instances in videos with only video-level action labels. Existing methods mainly embrace a localization-by-classification pipeline that optimizes the snippet-level prediction with a video classification loss. However, this formulation suffers from the discrepancy between classification and detection, resulting in inaccurate separation of foreground and background (F&B) snippets. To alleviate this problem, we propose to explore the underlying structure among the snippets by resorting to unsupervised snippet clustering, rather than heavily relying on the video classification loss. Specifically, we propose a novel clustering-based F&B separation algorithm. It comprises two core components: a snippet clustering component that groups the snippets into multiple latent clusters and a cluster classification component that further classifies the cluster as foreground or background. As there are no ground-truth labels to train these two components, we introduce a unified self-labeling mechanism based on optimal transport to produce high-quality pseudo-labels that match several plausible prior distributions. This ensures that the cluster assignments of the snippets can be accurately associated with their F&B labels, thereby boosting the F&B separation. We evaluate our method on three benchmarks: THUMOS14, ActivityNet v1.2 and v1.3. Our method achieves promising performance on all three benchmarks while being significantly more lightweight than previous methods. Code is available at https://github.com/Qinying-Liu/CASE
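To make the idea of optimal-transport self-labeling concrete, here is a minimal NumPy sketch of Sinkhorn-Knopp iterations that turn snippet-to-cluster scores into soft pseudo-labels whose cluster proportions match a given prior. This is not the authors' implementation (see the linked repository for that): the function names, the entropic regularization value `epsilon`, the prior, and the way cluster assignments are combined with per-cluster foreground probabilities are all assumptions made for illustration.

```python
import numpy as np


def sinkhorn_pseudo_labels(scores, prior, epsilon=0.1, n_iters=100):
    """Soft pseudo-labels whose cluster proportions follow `prior`.

    scores: (N, K) snippet-to-cluster logits.
    prior:  (K,) assumed fraction of snippets per cluster, summing to 1.
    Returns an (N, K) array with rows summing to 1.
    """
    scores = np.asarray(scores, dtype=float)
    prior = np.asarray(prior, dtype=float)
    n = scores.shape[0]
    # Entropic-regularized kernel; subtracting the max avoids overflow.
    plan = np.exp((scores - scores.max()) / epsilon)
    plan /= plan.sum()
    for _ in range(n_iters):
        plan *= prior / plan.sum(axis=0)                     # column marginals
        plan *= (1.0 / n) / plan.sum(axis=1, keepdims=True)  # row marginals
    return plan * n  # rescale so each snippet's assignment sums to 1


def foreground_scores(assignments, cluster_fg_prob):
    """Hypothetical combination: snippet foreground score as the
    assignment-weighted average of per-cluster foreground probabilities."""
    return assignments @ np.asarray(cluster_fg_prob, dtype=float)


# Toy data: 200 snippets, 4 latent clusters (all values are made up).
rng = np.random.default_rng(0)
scores = rng.normal(size=(200, 4))
prior = np.array([0.4, 0.3, 0.2, 0.1])
labels = sinkhorn_pseudo_labels(scores, prior, epsilon=0.5)
fg = foreground_scores(labels, [0.9, 0.8, 0.1, 0.05])
```

The alternating row and column rescaling is what enforces the prior: without it, pseudo-labels taken directly from the scores could collapse onto a few clusters.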


Code

qinying-liu/case official

100

Tasks

Action Localization

Classification

Clustering

Temporal Action Localization

Video Classification

Weakly Supervised Action Localization

Weakly-supervised Temporal Action Localization

Weakly Supervised Temporal Action Localization

Datasets

ActivityNet

THUMOS14

Results from the Paper

Ranked #1 on Weakly Supervised Action Localization on ActivityNet-1.2

