TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Atomic action recognition	CATER	SCI3D	Average-mAP	96.77	# 1
Atomic action recognition	CATER	R3D-NL	Average-mAP	95.28	# 2
Atomic action recognition	CATER	Single stream SCI3D	Average-mAP	91.82	# 3
Atomic action recognition	CATER	FasterRCNN	Average-mAP	63.85	# 4
Composite action recognition	CATER	Single stream SCI3D	Average-mAP	69.76	# 1
Composite action recognition	CATER	SCI3D	Average-mAP	66.71	# 2
Composite action recognition	CATER	R3D-NL	Average-mAP	52.19	# 3
Composite action recognition	CATER	FasterRCNN	Average-mAP	25.45	# 4

Deep set conditioned latent representations for action recognition

International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications 2022 · Akash Singh, Tom De Schepper, Kevin Mets, Peter Hellinckx, Jose Oramas, Steven Latre

In recent years, multi-label, multi-class video action recognition has gained significant popularity. While reasoning over temporally connected atomic actions is mundane for intelligent species, standard artificial neural networks (ANNs) still struggle to classify them. In the real world, atomic actions often connect temporally to form more complex composite actions. The challenge lies in recognising composite actions of varying durations while other distinct composite or atomic actions occur in the background. Drawing upon the success of relational networks, we propose methods that learn to reason over the semantic concepts of objects and actions. We empirically show how ANNs benefit from pretraining, relational inductive biases and unordered set-based latent representations. In this paper, we propose deep set conditioned I3D (SCI3D), a two-stream relational network that employs a latent representation of state and a visual representation for reasoning over events and actions. The network learns to reason about temporally connected actions in order to identify all of them in the video. The proposed method achieves an improvement of around 1.49% mAP in atomic action recognition and 17.57% mAP in composite action recognition over an I3D-NL baseline on the CATER dataset.
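The improvements quoted in the abstract can be cross-checked against the Average-mAP values listed for CATER on this page. The sketch below is only an illustration and rests on several assumptions: that the R3D-NL row in the results corresponds to the I3D-NL baseline named in the abstract, that the gains are absolute differences in Average-mAP, and that each gain is measured from the best-scoring non-baseline model per task. The names `RESULTS`, `BASELINE` and `gain_over_baseline` are hypothetical and do not come from the paper.

```python
# Average-mAP values copied from the CATER results listed on this page.
RESULTS = {
    "atomic": {
        "SCI3D": 96.77,
        "R3D-NL": 95.28,
        "Single stream SCI3D": 91.82,
        "FasterRCNN": 63.85,
    },
    "composite": {
        "Single stream SCI3D": 69.76,
        "SCI3D": 66.71,
        "R3D-NL": 52.19,
        "FasterRCNN": 25.45,
    },
}

# Assumption: this row is the I3D-NL baseline mentioned in the abstract.
BASELINE = "R3D-NL"


def gain_over_baseline(task):
    """Return (best non-baseline model, absolute Average-mAP gain over the baseline)."""
    scores = RESULTS[task]
    candidates = [model for model in scores if model != BASELINE]
    best_model = max(candidates, key=scores.get)
    gain = round(scores[best_model] - scores[BASELINE], 2)
    return best_model, gain


if __name__ == "__main__":
    for task in RESULTS:
        model, gain = gain_over_baseline(task)
        print(f"{task}: {model} improves on {BASELINE} by {gain} Average-mAP")
```

Under these assumptions the differences come out to 1.49 for atomic actions and 17.57 for composite actions, matching the abstract. Note that the composite figure is reproduced by the single stream variant (69.76), not by the two-stream SCI3D entry (66.71).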

Code

No code implementations yet.

Tasks

Action Recognition

Atomic action recognition

Composite action recognition

Temporal Action Localization

Datasets

UCF101

HMDB51

CLEVR

CATER

Results from the Paper

Ranked #1 on Atomic action recognition on CATER (using extra training data)

Methods

No methods listed for this paper.
