The Dataset of Multimodal Semantic Egocentric Video (DoMSEV) contains 80-hours of multimodal (RGB-D, IMU, and GPS) data related to First-Person Videos with annotations for recorder profile, frame scene, activities, interaction, and attention.
Source: A Weighted Sparse Sampling and Smoothing Frame Transition Approach for Semantic Fast-Forward First-Person VideosPaper | Code | Results | Date | Stars |
---|