Self-Supervised Action Recognition
34 papers with code • 6 benchmarks • 5 datasets
Latest papers
Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-supervised Action Recognition
AimCLR, a Contrastive Learning framework utilizing Abundant Information Mining for self-supervised action Representation, is proposed to make better use of the movement patterns introduced by extreme augmentations.
Self-Supervised Audio-Visual Representation Learning with Relaxed Cross-Modal Synchronicity
We present CrissCross, a self-supervised framework for learning audio-visual representations.
Learning the Predictability of the Future
We introduce a framework for learning from unlabeled video what is predictable in the future.
Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting
Instance-level contrastive learning techniques, which rely on data augmentation and a contrastive loss function, have found great success in the domain of visual representation learning.
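As a rough illustration of the kind of contrastive loss this teaser mentions, the sketch below assumes the common InfoNCE formulation, which the listing itself does not specify. The function name `info_nce`, the `temperature` default, and the toy embeddings are hypothetical; row i of `anchors` and row i of `positives` stand in for embeddings of two augmented views of the same clip.

```python
import math


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: anchor i should match positive i, not the other rows.

    Assumed formulation for illustration only; not taken from the listed paper.
    """
    def normalize(v):
        # L2-normalize so the dot product below is a cosine similarity.
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Similarity of anchor i to every candidate, scaled by the temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching view (index i) as the target.
        total += log_denom - logits[i]
    return total / len(a)


# Toy embeddings of two "views" per instance (hypothetical values).
views_a = [[1.0, 0.0], [0.0, 1.0]]
views_b = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = info_nce(views_a, views_b)
shuffled_loss = info_nce(views_a, views_b[::-1])
```

With matching views the loss is close to zero, while pairing each anchor with the wrong view drives it up, which is the signal that pulls augmented views of the same instance together.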
A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning
We present a large-scale study on unsupervised spatiotemporal representation learning from videos.
Broaden Your Views for Self-Supervised Video Learning
Most successful self-supervised learning methods are trained to align the representations of two independent views from the data.
TCLR: Temporal Contrastive Learning for Video Representation
Prior work on contrastive learning for video data has not explored the effect of explicitly encouraging the features to be distinct across the temporal dimension.
Pretext-Contrastive Learning: Toward Good Practices in Self-supervised Video Representation Learning
It is convenient to treat Pretext-Contrastive Learning (PCL) as a standard training strategy and apply it to many other methods in self-supervised video feature learning.
RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning
We study unsupervised video representation learning that seeks to learn both motion and appearance features from unlabeled video only, which can be reused for downstream tasks such as action recognition.
Self-supervised Co-training for Video Representation Learning
The objective of this paper is visual-only self-supervised video representation learning.