Video Recognition

147 papers with code • 0 benchmarks • 10 datasets

Video Recognition is the process of obtaining, processing, and analysing data from a visual source, specifically video.
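
The three stages named above (obtaining, processing, analysing) can be illustrated with a toy sketch. It is not taken from any paper or library listed on this page: the function names, the mean-intensity feature, the `threshold` parameter, and the "bright"/"dark" labels are all hypothetical stand-ins for a real feature extractor and classifier.

```python
from typing import List, Sequence

# Assumption: a frame is a grayscale image given as rows of intensities in [0, 1].
Frame = List[List[float]]


def sample_indices(num_frames: int, num_samples: int) -> List[int]:
    """Pick frame indices spread evenly across the clip."""
    if num_frames <= 0 or num_samples <= 0:
        return []
    num_samples = min(num_samples, num_frames)
    # Take the centre of each of num_samples equal-length segments.
    return [int((i + 0.5) * num_frames / num_samples) for i in range(num_samples)]


def frame_feature(frame: Frame) -> float:
    """Toy per-frame feature: mean pixel intensity (stand-in for a learned feature)."""
    pixels = [p for row in frame for p in row]
    return sum(pixels) / len(pixels)


def recognise(video: Sequence[Frame], num_samples: int = 4, threshold: float = 0.5) -> str:
    """Obtain frames, process them into features, and analyse the pooled result."""
    indices = sample_indices(len(video), num_samples)       # obtain
    if not indices:
        raise ValueError("video must contain at least one frame")
    features = [frame_feature(video[i]) for i in indices]   # process
    clip_score = sum(features) / len(features)              # average over time
    return "bright" if clip_score >= threshold else "dark"  # analyse (hypothetical labels)
```

In practice the per-frame feature and the final decision would typically come from a trained model; only the overall sample, featurise, pool, classify shape is meant to carry over.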

Benchmarks

These leaderboards are used to track progress in Video Recognition.

No evaluation results yet.

Libraries

Use these libraries to find Video Recognition models and implementations.

open-mmlab/mmaction2 • 5 papers • 3,908

open-mmlab/mmtracking • 3 papers • 3,384

facebookresearch/pytorchvideo • 3 papers • 3,186

towhee-io/towhee • 3 papers • 3,001

See all 9 libraries.

Datasets

Latest papers with no code

Audio-Visual Glance Network for Efficient Video Recognition

no code yet • ICCV 2023

To address this issue, we propose Audio-Visual Glance Network (AVGN), which leverages the commonly available audio and visual modalities to efficiently process the spatio-temporally important parts of a video.

On the Importance of Spatial Relations for Few-shot Action Recognition

no code yet • 14 Aug 2023

We are thus motivated to investigate the importance of spatial relations and propose a more accurate few-shot action recognition method that leverages both spatial and temporal information.

View while Moving: Efficient Video Recognition in Long-untrimmed Videos

no code yet • 9 Aug 2023

To this end, inspired by human cognition, we propose a novel recognition paradigm of "View while Moving" for efficient long-untrimmed video recognition.

TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter

no code yet • 22 Jun 2023

In situations involving system upgrades that require updating the upstream foundation model, it becomes essential to re-train all downstream modules to adapt to the new foundation model, which is inflexible and inefficient.

Enhanced Multimodal Representation Learning with Cross-modal KD

no code yet • CVPR 2023

This paper explores the tasks of leveraging auxiliary modalities which are only available at training to enhance multimodal representation learning through cross-modal Knowledge Distillation (KD).

A two-way translation system of Chinese sign language based on computer vision

no code yet • 3 Jun 2023

As the main means of communication for deaf people, sign language has a special grammatical order, so it is meaningful and valuable to develop a real-time translation system for sign language.

Spatiotemporal Attention-based Semantic Compression for Real-time Video Recognition

no code yet • 22 May 2023

This paper studies the computational offloading of video action recognition in edge computing.

Inter-frame Accelerate Attack against Video Interpolation Models

no code yet • 11 May 2023

We apply adversarial attacks to VIF models and find that the VIF models are very vulnerable to adversarial examples.

Multi-object Video Generation from Single Frame Layouts

no code yet • 6 May 2023

In this paper, we study video synthesis with emphasis on simplifying the generation conditions.

Efficient Decision-based Black-box Patch Attacks on Video Recognition

no code yet • ICCV 2023

First, STDE introduces target videos as patch textures and only adds patches on keyframes that are adaptively selected by temporal difference.
