Action Segmentation
72 papers with code • 9 benchmarks • 16 datasets
Action Segmentation is a challenging problem in high-level video understanding. In its simplest form, Action Segmentation aims to divide a temporally untrimmed video into segments along the time axis and label each segment with one of a set of pre-defined action labels. The results of Action Segmentation can then be used as input to various applications, such as video-to-text and action localization.
Source: TricorNet: A Hybrid Temporal Convolutional and Recurrent Network for Video Action Segmentation
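As a rough illustration of the task definition above, the output of an action segmentation model is often handled as one label per frame, which can then be collapsed into contiguous labeled segments. The sketch below assumes that representation; the function name `frames_to_segments`, the `(start, end, label)` tuple layout, and the example labels are hypothetical and not taken from any of the listed papers.

```python
from itertools import groupby
from typing import List, Tuple

# (start_frame, end_frame_exclusive, label); this layout is an assumption.
Segment = Tuple[int, int, str]


def frames_to_segments(frame_labels: List[str]) -> List[Segment]:
    """Collapse a per-frame label sequence into contiguous labeled segments."""
    segments: List[Segment] = []
    start = 0
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)  # number of consecutive frames with this label
        segments.append((start, start + length, label))
        start += length
    return segments


# Hypothetical per-frame predictions for a 10-frame untrimmed video.
frame_labels = ["pour"] * 3 + ["stir"] * 5 + ["pour"] * 2
segments = frames_to_segments(frame_labels)
print(segments)
```

By construction the resulting segments do not overlap and together cover every frame, which matches the usual formulation of the task as a temporal partition of the video.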
Latest papers with no code
SMC-NCA: Semantic-guided Multi-level Contrast for Semi-supervised Temporal Action Segmentation
However, learning the representation of each frame by unsupervised contrastive learning for action segmentation remains an open and challenging problem.
X4D-SceneFormer: Enhanced Scene Understanding on 4D Point Cloud Videos through Cross-modal Knowledge Transfer
The field of 4D point cloud understanding is rapidly developing with the goal of analyzing dynamic 3D point cloud sequences.
AdaFocus: Towards End-to-end Weakly Supervised Learning for Long-Video Action Understanding
Under the weak supervision setting, action labels are provided for the whole video without precise start and end times of the action clip.
CASR: Refining Action Segmentation via Marginalizing Frame-level Causal Relationships
CASR works by reducing the difference between the causal adjacency matrix we construct and the one derived from the pre-segmentation results of backbone models.
NSM4D: Neural Scene Model Based Online 4D Point Cloud Sequence Understanding
We integrate NSM4D with state-of-the-art 4D perception backbones, demonstrating significant improvements on various online perception benchmarks in indoor and outdoor settings.
Action Segmentation Using 2D Skeleton Heatmaps and Multi-Modality Fusion
This paper presents a 2D skeleton-based action segmentation method with applications in fine-grained human activity recognition.
Prompt-enhanced Hierarchical Transformer Elevating Cardiopulmonary Resuscitation Instruction via Temporal Action Segmentation
The vast majority of people who suffer unexpected cardiac arrest receive cardiopulmonary resuscitation (CPR) from passersby in a desperate attempt to restore life, but these efforts often prove fruitless because the rescuers are not qualified to perform it.
LAC: Latent Action Composition for Skeleton-based Action Segmentation
In this context, we propose Latent Action Composition (LAC), a novel self-supervised framework aiming at learning from synthesized composable motions for skeleton-based action segmentation.
BIT: Bi-Level Temporal Modeling for Efficient Supervised Action Segmentation
We address the task of supervised action segmentation which aims to partition a video into non-overlapping segments, each representing a different action.
DPMix: Mixture of Depth and Point Cloud Video Experts for 4D Action Segmentation
The proposed method, named Mixture of Depth and Point cloud video experts (DPMix), achieved the first place in the 4D Action Segmentation Track of the HOI4D Challenge 2023.