Video Recognition

18 papers with code · Computer Vision
Subtask of Video

Leaderboards

No evaluation results yet. Help compare methods by submitting evaluation metrics.

Greatest papers with code

Would Mega-scale Datasets Further Enhance Spatiotemporal 3D CNNs?

10 Apr 2020 kenshohara/3D-ResNets-PyTorch

Therefore, in the present paper, we conduct an exploratory study in order to improve spatiotemporal 3D CNNs as follows: (i) Recently proposed large-scale video datasets help improve spatiotemporal 3D CNNs in terms of video classification accuracy.

VIDEO CLASSIFICATION · VIDEO RECOGNITION

Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution

ICCV 2019 osmr/imgclsmob

Similarly, the output feature maps of a convolution layer can also be seen as a mixture of information at different frequencies.

#34 best model for Image Classification on ImageNet (using extra training data)

IMAGE CLASSIFICATION · VIDEO RECOGNITION
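
To make the frequency view in the Octave Convolution entry above concrete, here is a minimal sketch. It is not the paper's operator: it only splits a single 2D feature map into a smooth low-frequency part (block averages repeated back to full size) and a high-frequency residual. The function name `split_frequencies` and the `factor` parameter are hypothetical, chosen for illustration.

```python
def split_frequencies(feature_map, factor=2):
    """Split a 2D map (list of rows) into low- and high-frequency parts.

    low  : each factor x factor block replaced by its mean (smooth content)
    high : residual feature_map - low (fine detail)
    Illustrative assumption, not the Octave Convolution layer itself.
    """
    height, width = len(feature_map), len(feature_map[0])
    if height % factor or width % factor:
        raise ValueError("map size must be divisible by factor")

    low = [[0.0] * width for _ in range(height)]
    for top in range(0, height, factor):
        for left in range(0, width, factor):
            cells = [(i, j)
                     for i in range(top, top + factor)
                     for j in range(left, left + factor)]
            mean = sum(feature_map[i][j] for i, j in cells) / len(cells)
            for i, j in cells:
                low[i][j] = mean

    high = [[feature_map[i][j] - low[i][j] for j in range(width)]
            for i in range(height)]
    return low, high


fmap = [[1.0, 3.0, 10.0, 10.0],
        [5.0, 7.0, 10.0, 10.0]]
low, high = split_frequencies(fmap)
```

In this toy map the flat right-hand block has no high-frequency content at all, so it could be stored at the lower resolution without loss. That is the kind of spatial redundancy the paper's title refers to.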

Deep Feature Flow for Video Recognition

CVPR 2017 msracver/Deep-Feature-Flow

Yet, it is non-trivial to transfer the state-of-the-art image recognition networks to videos as per-frame evaluation is too slow and unaffordable.

VIDEO RECOGNITION

Flow-Guided Feature Aggregation for Video Object Detection

ICCV 2017 msracver/Flow-Guided-Feature-Aggregation

The accuracy of detection suffers from degenerated object appearances in videos, e.g., motion blur, video defocus, rare poses, etc.

VIDEO OBJECT DETECTION · VIDEO RECOGNITION

Weight Standardization

25 Mar 2019 joe-siyuan-qiao/WeightStandardization

The micro-batch training setting is hard because small batch sizes are not enough for training networks with Batch Normalization (BN), while other normalization methods that do not rely on batch knowledge still have difficulty matching the performances of BN in large-batch training.

IMAGE CLASSIFICATION · INSTANCE SEGMENTATION · OBJECT DETECTION · SEMANTIC SEGMENTATION · VIDEO RECOGNITION
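
The Weight Standardization entry above concerns normalization that does not depend on batch statistics. As a rough sketch of the general idea, assuming the usual formulation in which each output filter's weights are shifted to zero mean and scaled to unit variance, the following normalizes weights rather than activations. The function names, the flat-list weight layout, and the `eps` value are assumptions for illustration and are not taken from the linked repository.

```python
import math


def standardize_filter(weights, eps=1e-5):
    """Return one filter's weights with zero mean and (near) unit variance.

    `weights` is a flat list holding every weight of a single output filter.
    `eps` guards against division by zero; its value is an assumption.
    """
    n = len(weights)
    mean = sum(weights) / n
    var = sum((w - mean) ** 2 for w in weights) / n
    std = math.sqrt(var + eps)
    return [(w - mean) / std for w in weights]


def weight_standardize(conv_weights, eps=1e-5):
    """Standardize each output filter independently (no batch data needed)."""
    return [standardize_filter(f, eps) for f in conv_weights]


filters = [[1.0, 2.0, 3.0, 4.0],
           [10.0, 10.0, 20.0, 20.0]]
standardized = weight_standardize(filters)
```

Because the statistics come from the weights alone, the result is the same whether the batch holds one sample or many, which is why this kind of normalization suits the micro-batch setting described in the entry.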

Improved Residual Networks for Image and Video Recognition

10 Apr 2020 iduta/iresnet

We successfully train a 404-layer deep CNN on the ImageNet dataset and a 3002-layer network on CIFAR-10 and CIFAR-100, while the baseline is not able to converge at such extreme depths.

IMAGE CLASSIFICATION · OBJECT DETECTION · TEMPORAL ACTION LOCALIZATION · VIDEO RECOGNITION

Clockwork Convnets for Video Semantic Segmentation

11 Aug 2016 shelhamer/clockwork-fcn

Recent years have seen tremendous progress in still-image segmentation; however, the naïve application of these state-of-the-art algorithms to every video frame requires considerable computation and ignores the temporal continuity inherent in video.

SEMANTIC SEGMENTATION · VIDEO RECOGNITION · VIDEO SEMANTIC SEGMENTATION