Video Recognition

147 papers with code • 0 benchmarks • 10 datasets

Video Recognition is the process of obtaining, processing, and analysing data from a visual source, specifically video.

Most implemented papers

Revisiting 3D ResNets for Video Recognition

tensorflow/models 3 Sep 2021

A recent work from Bello shows that training and scaling strategies may be more significant than model architectures for visual recognition.

Revisiting Classifier: Transferring Vision-Language Models for Video Recognition

whwu95/text4vis 4 Jul 2022

In this study, we focus on transferring knowledge for video classification tasks.

Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models

whwu95/BIKE CVPR 2023

In this paper, we propose a novel framework called BIKE, which utilizes the cross-modal bridge to explore bidirectional knowledge: i) We introduce the Video Attribute Association mechanism, which leverages the Video-to-Text knowledge to generate textual auxiliary attributes for complementing video recognition.

TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?

google-research/scenic 21 Jun 2021

In this paper, we introduce a novel visual representation learning approach that relies on a handful of adaptively learned tokens and is applicable to both image and video understanding tasks.
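
As a rough illustration of what "adaptively learned tokens" could mean, the sketch below pools a grid of per-position features into a few tokens, each token using attention weights computed from the input itself. The function name `learn_tokens`, the `score_vectors` parameter, and the choice of a single linear score followed by a sigmoid are assumptions made for this example; the summary above does not specify how the paper computes its attention maps.

```python
import math


def learn_tokens(features, score_vectors):
    """Pool N positions into S tokens (illustrative sketch only).

    features: list of N positions, each a list of C channel values.
    score_vectors: list of S hypothetical learned weight vectors of length C.
    Returns a list of S tokens, each a list of C values.
    """
    n = len(features)
    channels = len(features[0])
    tokens = []
    for w in score_vectors:
        # Per-position attention weight in (0, 1), derived from the input
        # (assumed form: sigmoid of a dot product).
        attn = [
            1.0 / (1.0 + math.exp(-sum(f_c * w_c for f_c, w_c in zip(f, w))))
            for f in features
        ]
        # Attention-weighted average over all positions.
        token = [
            sum(a * f[c] for a, f in zip(attn, features)) / n
            for c in range(channels)
        ]
        tokens.append(token)
    return tokens


# Toy input: 4 positions with 2 channels each, reduced to 2 tokens.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
score_vectors = [[1.0, 0.0], [0.0, 1.0]]
tokens = learn_tokens(features, score_vectors)
```

The number of tokens is set by the number of score vectors, independent of how many positions the input has, which is what would let a small token count stand in for a large image or video grid.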

TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Device

MIT-HAN-LAB/temporal-shift-module 27 Sep 2021

Secondly, TSM has high efficiency; it achieves a high frame rate of 74fps and 29fps for online video recognition on Jetson Nano and Galaxy Note8.

Deep Feature Flow for Video Recognition

msracver/Deep-Feature-Flow CVPR 2017

Yet, it is non-trivial to transfer the state-of-the-art image recognition networks to videos as per-frame evaluation is too slow and unaffordable.

Audiovisual SlowFast Networks for Video Recognition

facebookresearch/SlowFast 23 Jan 2020

We present Audiovisual SlowFast Networks, an architecture for integrated audiovisual perception.

Omni-sourced Webly-supervised Learning for Video Recognition

open-mmlab/mmaction ECCV 2020

Then a joint-training strategy is proposed to deal with the domain gaps between multiple data sources and formats in webly-supervised learning.

Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition

iduta/pyconv 20 Jun 2020

This work introduces pyramidal convolution (PyConv), which is capable of processing the input at multiple filter scales.
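
A minimal sketch of the multi-scale idea, assuming a 1D input and plain box kernels: the same input is filtered by kernels of several sizes in parallel and the outputs are stacked as separate channels. The helper names `conv1d_same` and `pyramidal_conv1d` and the kernels are hypothetical, chosen only for this example; details of the actual PyConv design are not covered by the summary above and are left out.

```python
def conv1d_same(signal, kernel):
    """Cross-correlate a 1D signal with an odd-length kernel.

    The signal is zero-padded so the output has the same length as the input.
    """
    pad = len(kernel) // 2
    padded = [0] * pad + list(signal) + [0] * pad
    return [
        sum(padded[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(len(signal))
    ]


def pyramidal_conv1d(signal, kernels):
    """Apply one kernel per pyramid level to the same input.

    Returns one output channel per kernel, in the order given.
    """
    return [conv1d_same(signal, k) for k in kernels]


# Hypothetical pyramid with kernel sizes 1, 3, and 5.
kernels = [[1], [1, 1, 1], [1, 1, 1, 1, 1]]
signal = [0, 0, 1, 0, 0]
features = pyramidal_conv1d(signal, kernels)
```

With a single impulse as input, each level spreads the response over a wider neighbourhood, so the stacked channels describe the same location at three different scales.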

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

whwu95/MVFNet 13 Dec 2020

Existing state-of-the-art methods have achieved excellent accuracy regardless of complexity; meanwhile, efficient spatiotemporal modeling solutions are slightly inferior in performance.