Two-Stream Convolutional Networks for Action Recognition in Videos

NeurIPS 2014  ·  Karen Simonyan, Andrew Zisserman

We investigate architectures of discriminatively trained deep Convolutional Networks (ConvNets) for action recognition in video. The challenge is to capture the complementary information on appearance from still frames and motion between frames. We also aim to generalise the best performing hand-crafted features within a data-driven learning framework. Our contribution is three-fold. First, we propose a two-stream ConvNet architecture which incorporates spatial and temporal networks. Second, we demonstrate that a ConvNet trained on multi-frame dense optical flow is able to achieve very good performance in spite of limited training data. Finally, we show that multi-task learning, applied to two different action classification datasets, can be used to increase the amount of training data and improve the performance on both. Our architecture is trained and evaluated on the standard video actions benchmarks of UCF-101 and HMDB-51, where it is competitive with the state of the art. It also exceeds by a large margin previous attempts to use deep nets for video classification.
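To make the architecture concrete, below is a minimal PyTorch sketch of the two-stream idea: a spatial ConvNet over a single RGB frame and a temporal ConvNet over a stack of L dense optical-flow fields (2L input channels, one x- and one y-component per frame), fused late by averaging softmax scores. The layer sizes and class names (StreamConvNet, TwoStreamNet) are illustrative placeholders, not the CNN-M-2048-style networks used in the paper; score averaging is one of the fusion schemes the authors evaluate (the other trains an SVM on stacked softmax scores).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StreamConvNet(nn.Module):
    """One ConvNet stream. in_channels = 3 for the spatial stream (one RGB
    frame) or 2*L for the temporal stream (L stacked optical-flow fields)."""
    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 96, kernel_size=7, stride=2), nn.ReLU(),
            nn.MaxPool2d(3, stride=2),
            nn.Conv2d(96, 256, kernel_size=5, stride=2), nn.ReLU(),
            nn.MaxPool2d(3, stride=2),
            nn.Conv2d(256, 512, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(512, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

class TwoStreamNet(nn.Module):
    """Late fusion by averaging the two streams' softmax scores."""
    def __init__(self, num_classes: int, flow_len: int = 10):
        super().__init__()
        self.spatial = StreamConvNet(3, num_classes)
        self.temporal = StreamConvNet(2 * flow_len, num_classes)

    def forward(self, rgb, flow):
        return (F.softmax(self.spatial(rgb), dim=1)
                + F.softmax(self.temporal(flow), dim=1)) / 2

model = TwoStreamNet(num_classes=101, flow_len=10)   # e.g. UCF-101
rgb = torch.randn(2, 3, 224, 224)                    # one RGB frame per clip
flow = torch.randn(2, 20, 224, 224)                  # 10 flow fields, x and y
probs = model(rgb, flow)                             # (2, 101) fused scores
```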
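The multi-task contribution can be sketched the same way: a single shared trunk with one classification head per dataset, so videos from both UCF-101 and HMDB-51 contribute gradients to the shared layers. This is only an illustration of the idea under that assumption; the trunk below is a toy stand-in and all names (MultiTaskHead, head_ucf, head_hmdb) are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskHead(nn.Module):
    """One shared trunk, one softmax head per dataset (a sketch of a
    multi-task temporal stream; layer sizes are placeholders)."""
    def __init__(self, trunk: nn.Module, feat_dim: int):
        super().__init__()
        self.trunk = trunk
        self.head_ucf = nn.Linear(feat_dim, 101)    # UCF-101 classes
        self.head_hmdb = nn.Linear(feat_dim, 51)    # HMDB-51 classes

    def forward(self, x, dataset: str):
        feats = self.trunk(x)
        return self.head_ucf(feats) if dataset == "ucf" else self.head_hmdb(feats)

# Toy trunk over stacked optical flow (20 channels = 10 frames x 2 components).
trunk = nn.Sequential(
    nn.Conv2d(20, 64, kernel_size=7, stride=4), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
model = MultiTaskHead(trunk, feat_dim=64)

# Each batch is routed through the head of the dataset it came from,
# so both losses backpropagate into the shared trunk.
flow = torch.randn(4, 20, 224, 224)
loss = F.cross_entropy(model(flow, "ucf"), torch.randint(0, 101, (4,)))
loss.backward()
```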

Results

Task                     | Dataset                    | Model                            | Metric                       | Value | Global Rank
Action Classification   | Charades                   | 2-Strm                           | MAP                          | 18.6  | #48
Action Recognition      | HMDB-51                    | Two-Stream (ImageNet pretrained) | Average accuracy of 3 splits | 59.4  | #67
Action Recognition      | UCF101                     | Two-Stream (ImageNet pretrained) | 3-fold Accuracy              | 88.0  | #73
Hand Gesture Recognition | VIVA Hand Gestures Dataset | Two Stream CNNs                 | Accuracy                     | 68    | #3
