Video Recognition
147 papers with code • 0 benchmarks • 10 datasets
Video Recognition is the process of obtaining, processing, and analysing data received from a visual source, specifically video.
Libraries
Use these libraries to find Video Recognition models and implementations.

Datasets
Latest papers
Open-VCLIP: Transforming CLIP to an Open-vocabulary Video Model via Interpolated Weight Optimization
Our framework extends CLIP with minimal modifications to model spatial-temporal relationships in videos, making it a specialized video classifier, while striving for generalization.
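Interpolated weight optimization generally means blending the parameters of the fine-tuned video model with the original CLIP weights to retain zero-shot generalization. A minimal sketch of that blending step, assuming weights are stored as plain arrays in a dict (the function name and toy data are illustrative, not the paper's API):

```python
import numpy as np

def interpolate_weights(pretrained, finetuned, alpha):
    """Blend two weight dicts: theta = (1 - alpha) * pretrained + alpha * finetuned.

    alpha = 0 keeps the original (generalist) weights; alpha = 1 keeps the
    fully fine-tuned (specialist) weights; values in between trade the two off.
    """
    return {name: (1.0 - alpha) * pretrained[name] + alpha * finetuned[name]
            for name in pretrained}

# Toy example: a single 4-dimensional "layer".
pre = {"proj": np.zeros(4)}
ft = {"proj": np.ones(4)}
merged = interpolate_weights(pre, ft, alpha=0.25)  # each entry becomes 0.25
```

In practice the same elementwise interpolation would be applied over a full model state dict, with alpha tuned on held-out data.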
Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring
In this paper, based on the CLIP model, we revisit temporal modeling in the context of image-to-video knowledge transfer, which is key to extending image-text pretrained models to the video domain.
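The simplest temporal-modeling baseline that work in this area builds on is mean pooling of per-frame image embeddings. A sketch under that assumption (the helper name is illustrative; the embeddings would come from a CLIP-style image encoder):

```python
import numpy as np

def video_embedding(frame_features):
    """Mean-pool per-frame embeddings into a single video embedding,
    then L2-normalize (as CLIP does before computing similarities).

    frame_features: (T, D) array of per-frame image embeddings.
    """
    pooled = frame_features.mean(axis=0)        # average over the T frames
    return pooled / np.linalg.norm(pooled)      # unit-length video vector

frames = np.arange(12, dtype=float).reshape(3, 4)  # 3 frames, 4-dim features
vid = video_embedding(frames)
```

Temporal-modeling methods replace the mean pooling with learned modules (e.g. temporal attention), but the input/output contract, frames in, one normalized video vector out, stays the same.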
Efficient Robustness Assessment via Adversarial Spatial-Temporal Focus on Videos
To implement this idea, we design the novel Adversarial spatial-temporal Focus (AstFocus) attack on videos, which simultaneously attacks the key frames and key regions identified from inter-frame and intra-frame cues in the video.
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
In this paper, we propose a novel framework called BIKE, which utilizes the cross-modal bridge to explore bidirectional knowledge: i) We introduce the Video Attribute Association mechanism, which leverages the Video-to-Text knowledge to generate textual auxiliary attributes for complementing video recognition.
Efficient Movie Scene Detection using State-Space Transformers
Given a sequence of frames divided into movie shots (uninterrupted periods where the camera position does not change), the S4A block first applies self-attention to capture short-range intra-shot dependencies.
VLG: General Video Recognition with Web Textual Knowledge
Our VLG is first pre-trained on video and language datasets to learn a shared feature space, and then employs a flexible bi-modal attention head to fuse high-level semantic concepts under different settings.
SVFormer: Semi-supervised Video Transformer for Action Recognition
In this paper, we investigate the use of transformer models under the SSL setting for action recognition.
Look More but Care Less in Video Recognition
To tackle this problem, we propose Ample and Focal Network (AFNet), which is composed of two branches to utilize more frames but with less computation.
Cluster and Aggregate: Face Recognition with Large Probe Set
Advances in attention and recurrent modules have led to feature fusion that can model the relationship among the images in the input set.
Towards a Unified View on Visual Parameter-Efficient Transfer Learning
Towards this goal, we propose a framework with a unified view of PETL called visual-PETL (V-PETL) to investigate the effects of different PETL techniques, data scales of downstream domains, positions of trainable parameters, and other aspects affecting the trade-off.
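The common idea behind PETL techniques is to freeze the pretrained backbone and train only a small number of added parameters, such as a bottleneck adapter. A numpy sketch of that structure (all names and sizes are illustrative, not the V-PETL framework itself):

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "backbone": a fixed random projection standing in for a pretrained model.
W_backbone = rng.normal(size=(16, 8))   # never updated during transfer

# Trainable adapter: a small residual bottleneck (8 -> 2 -> 8). These are the
# only parameters a PETL method would update; initialized to zero so the
# adapted model starts out identical to the frozen backbone.
W_down = np.zeros((8, 2))
W_up = np.zeros((2, 8))

def forward(x):
    h = x @ W_backbone                  # frozen feature extraction
    return h + (h @ W_down) @ W_up      # residual adapter on top

trainable = W_down.size + W_up.size     # 32 adapter parameters
total = W_backbone.size + trainable     # 160 parameters overall
```

Here only 32 of 160 parameters would be updated; the trade-offs the paper studies (where the trainable parameters sit, how many there are, how they scale with downstream data) all vary choices like these.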