Self-Supervised Action Recognition

34 papers with code • 6 benchmarks • 5 datasets

Latest papers with no code

A Large-Scale Analysis on Self-Supervised Video Representation Learning

no code yet • 9 Jun 2023

In this work, we first provide a benchmark that enables a comparison of existing approaches on the same ground.

Self-supervised Contrastive Learning for Audio-Visual Action Recognition

no code yet • 28 Apr 2022

To learn supervisory signals from unlabeled videos, we propose a novel self-supervised contrastive learning module (SelfCL).

Human-Centered Prior-Guided and Task-Dependent Multi-Task Representation Learning for Action Recognition Pre-Training

no code yet • 27 Apr 2022

Recently, much progress has been made for self-supervised action recognition.

Self-Supervised Video Representation Learning with Meta-Contrastive Network

no code yet • ICCV 2021

Our method contains two training stages based on model-agnostic meta learning (MAML), each of which consists of a contrastive branch and a meta branch.

Self-Supervised Learning via multi-Transformation Classification for Action Recognition

no code yet • 20 Feb 2021

We use the learned models in pretext tasks as the pre-trained models and fine-tune them to recognize human actions in the downstream task.

Evolving Losses for Unsupervised Video Representation Learning

no code yet • CVPR 2020

We present a new method to learn video representations from large-scale unlabeled video data.

Skip-Clip: Self-Supervised Spatiotemporal Representation Learning by Future Clip Order Ranking

no code yet • 28 Oct 2019

Deep neural networks require collecting and annotating large amounts of data to train successfully.

Self-Supervised Spatiotemporal Learning via Video Clip Order Prediction

no code yet • CVPR 2019

Our method can learn the spatiotemporal representation of the video by predicting the order of shuffled clips from the video.
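A minimal sketch of how a clip-order pretext sample could be built, written only from the one-sentence description above. The function name `make_clip_order_sample`, the clip count and length, the use of contiguous non-overlapping clips, and the choice to encode the order as a permutation index are all illustrative assumptions, not details confirmed by the paper.

```python
import itertools
import random


def make_clip_order_sample(num_frames, num_clips=3, clip_len=16, rng=None):
    """Build one clip-order pretext sample (hypothetical helper).

    Returns (shuffled_clips, label). Each clip is a list of frame indices;
    label is the index of the permutation that was applied, which a model
    would be trained to predict from the shuffled clips.
    """
    rng = rng or random.Random()
    span = num_clips * clip_len
    if num_frames < span:
        raise ValueError("video is too short for the requested clips")

    # Assumption: clips are contiguous and non-overlapping.
    start = rng.randrange(num_frames - span + 1)
    clips = [
        list(range(start + i * clip_len, start + (i + 1) * clip_len))
        for i in range(num_clips)
    ]

    # Every possible ordering is one class; pick one at random as the target.
    perms = list(itertools.permutations(range(num_clips)))
    label = rng.randrange(len(perms))
    shuffled_clips = [clips[i] for i in perms[label]]
    return shuffled_clips, label


if __name__ == "__main__":
    shuffled, label = make_clip_order_sample(num_frames=100, rng=random.Random(0))
    print("first frame of each shuffled clip:", [c[0] for c in shuffled])
    print("permutation label:", label)
```

With three clips there are 3! = 6 possible orderings, so under these assumptions the pretext task is a 6-way classification problem that needs no human labels.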

Self-Supervised Spatiotemporal Feature Learning via Video Rotation Prediction

no code yet • 28 Nov 2018

The success of deep neural networks generally requires a vast amount of labeled training data, which is expensive and infeasible at scale, especially for video collections.

Self-Supervised Video Representation Learning with Space-Time Cubic Puzzles

no code yet • 24 Nov 2018

Self-supervised tasks such as colorization, inpainting, and jigsaw puzzles have been used for visual representation learning on still images when labeled images are limited or absent.