Mutual Modality Learning for Video Action Classification

4 Nov 2020  ·  Stepan Komkov, Maksim Dzabraev, Aleksandr Petiushko

Models for video action classification are progressing rapidly. However, their performance can still be easily improved by ensembling with the same models trained on different modalities (e.g., optical flow). Unfortunately, using several modalities during inference is computationally expensive. Recent works examine ways to integrate the advantages of multi-modality into a single RGB model. Yet there is still room for improvement. In this paper, we explore various methods to embed the ensemble power into a single model. We show that proper initialization, as well as mutual modality learning, enhances single-modality models. As a result, we achieve state-of-the-art results on the Something-Something-v2 benchmark.
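The page carries no code, but the abstract's "mutual modality learning" builds on the co-training idea of deep mutual learning: each modality-specific model fits the labels while also being pulled toward the other model's predictions, so the RGB model absorbs knowledge from the flow modality and can be deployed alone. The sketch below is a minimal, hypothetical PyTorch illustration of one such training step; all names (rgb_model, flow_model, kl_weight, etc.) are assumptions, and the paper's exact MML objective may differ.

```python
# Hypothetical sketch of a mutual-learning step between an RGB model and an
# optical-flow model, in the style of deep mutual learning. Illustrative only;
# not the paper's verified implementation.
import torch
import torch.nn.functional as F

def mutual_learning_step(rgb_model, flow_model, rgb_clip, flow_clip, labels,
                         opt_rgb, opt_flow, kl_weight=1.0):
    """One step: each model minimizes cross-entropy on the action labels
    plus a KL term that mimics the other model's (detached) posterior."""
    rgb_logits = rgb_model(rgb_clip)     # (batch, num_classes)
    flow_logits = flow_model(flow_clip)  # (batch, num_classes)

    # Supervised cross-entropy against the ground-truth action labels.
    ce_rgb = F.cross_entropy(rgb_logits, labels)
    ce_flow = F.cross_entropy(flow_logits, labels)

    # Symmetric mimicry terms: targets are detached so each model only
    # updates its own parameters through its KL term.
    kl_rgb = F.kl_div(F.log_softmax(rgb_logits, dim=1),
                      F.softmax(flow_logits.detach(), dim=1),
                      reduction="batchmean")
    kl_flow = F.kl_div(F.log_softmax(flow_logits, dim=1),
                       F.softmax(rgb_logits.detach(), dim=1),
                       reduction="batchmean")

    loss_rgb = ce_rgb + kl_weight * kl_rgb
    loss_flow = ce_flow + kl_weight * kl_flow

    opt_rgb.zero_grad()
    loss_rgb.backward()
    opt_rgb.step()

    opt_flow.zero_grad()
    loss_flow.backward()
    opt_flow.step()
    return loss_rgb.item(), loss_flow.item()

# At inference, only the RGB model is kept, so a single modality suffices.
```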


Results from the Paper


Ranked #47 on Action Recognition on Something-Something V2 (using extra training data)

Task                 Dataset                 Model           Metric Name      Metric Value   Global Rank   Uses Extra Training Data
Action Recognition   Something-Something V2  MML (ensemble)  Top-1 Accuracy   69.02          #47           Yes
Action Recognition   Something-Something V2  MML (ensemble)  Top-5 Accuracy   92.70          #24           Yes
Action Recognition   Something-Something V2  MML (single)    Top-1 Accuracy   66.83          #73           Yes
Action Recognition   Something-Something V2  MML (single)    Top-5 Accuracy   91.30          #38           Yes
