Action Understanding
24 papers with code • 1 benchmark • 4 datasets
Most implemented papers
Few-Shot Fine-Grained Action Recognition via Bidirectional Attention and Contrastive Meta-Learning
Fine-grained action recognition is attracting increasing attention due to the emerging demand for specific action understanding in real-world applications, whereas data for rare fine-grained categories is very limited.
Video Pose Distillation for Few-Shot, Fine-Grained Sports Action Recognition
This leads to poor accuracy when downstream tasks, such as action recognition, depend on pose.
Towards Tokenized Human Dynamics Representation
For human action understanding, a popular research direction is to analyze short video clips with unambiguous semantic content, such as jumping and drinking.
Domain Knowledge-Informed Self-Supervised Representations for Workout Form Assessment
To that end, we propose to learn exercise-oriented image and video representations from unlabeled samples such that a small dataset annotated by experts suffices for supervised error detection.
Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos
The generated text prompts are paired with corresponding video clips, and together co-train the text encoder and the video encoder via a contrastive approach.
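The contrastive co-training of a text encoder and a video encoder mentioned above can be illustrated with a generic symmetric contrastive (InfoNCE-style) loss. This is a minimal sketch, not the Bridge-Prompt implementation: the function names, the toy embeddings, and the temperature value of 0.07 are all assumptions made for illustration, and real encoders would produce the embeddings that are hard-coded here.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (falls back to 1.0 for a zero vector)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def symmetric_contrastive_loss(video_emb, text_emb, temperature=0.07):
    """Hypothetical helper: average of video-to-text and text-to-video losses.

    video_emb[i] and text_emb[i] are assumed to be a matched clip/prompt pair;
    every other combination in the batch is treated as a negative.
    """
    v = [l2_normalize(e) for e in video_emb]
    t = [l2_normalize(e) for e in text_emb]
    # Cosine similarity matrix scaled by the temperature (assumed value).
    logits = [[sum(a * b for a, b in zip(vi, tj)) / temperature for tj in t]
              for vi in v]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-video direction
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))

# Toy 2-D embeddings standing in for encoder outputs (illustrative only).
video = [[1.0, 0.0], [0.0, 1.0]]
matched_text = [[0.9, 0.1], [0.1, 0.9]]
swapped_text = [[0.1, 0.9], [0.9, 0.1]]

loss_matched = symmetric_contrastive_loss(video, matched_text)
loss_swapped = symmetric_contrastive_loss(video, swapped_text)
print(loss_matched < loss_swapped)
```

The loss is lower when each clip embedding lies closest to its own prompt embedding, which is the signal that pulls both encoders toward a shared space during training.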
Tragedy Plus Time: Capturing Unintended Human Activities from Weakly-labeled Videos
In videos that contain actions performed unintentionally, agents do not achieve their desired goals.
Action Quality Assessment with Temporal Parsing Transformer
Action Quality Assessment (AQA) is important for action understanding, and resolving the task poses unique challenges due to subtle visual differences.
Weakly-Supervised Temporal Action Detection for Fine-Grained Videos with Hierarchical Atomic Actions
Action understanding has evolved into the era of fine granularity, as most human behaviors in real life have only minor differences.
Paxion: Patching Action Knowledge in Video-Language Foundation Models
Action knowledge involves the understanding of textual, visual, and temporal aspects of actions.
Memory-and-Anticipation Transformer for Online Action Understanding
Based on this idea, we present Memory-and-Anticipation Transformer (MAT), a memory-anticipation-based approach, to address the online action detection and anticipation tasks.