Search Results for author: Rohan Myer Krishnan

Found 1 paper, 0 papers with code

Spacewalk-18: A Benchmark for Multimodal and Long-form Procedural Video Understanding

no code implementations • 30 Nov 2023 • Rohan Myer Krishnan, Zitian Tang, Zhiqiu Yu, Chen Sun

To do this, video-language models must be able to obtain structured understandings, such as the temporal segmentation of a demonstration into sequences of actions and skills, and to generalize the understandings to novel domains.

Video Retrieval, Video Understanding
