ActionBench

ActionBench contains two carefully designed probing tasks: Action Antonym and Video Reversal, which targets multimodal alignment capabilities and temporal understanding skills of the model, respectively. Action knowledge involves the understanding of textual, visual, and temporal aspects of actions. The benchmark is constructed by leveraging two existing open-domain video-language datasets, Ego4D and Something-Something v2 (SSv2), which provide fine-grained action annotations for each video clip.

Source: Paxion: Patching Action Knowledge in Video-Language Foundation Models

Papers


Paper Code Results Date Stars

Dataset Loaders


No data loaders found. You can submit your data loader here.

Tasks


Similar Datasets


License


  • Unknown

Modalities


Languages