Test-of-Time (Test of Time Synthetic Video Dataset)

Introduced by Bagad et al. in Test of Time: Instilling Video-Language Models with a Sense of Time

The goal of this dataset is to probe video-language models for their understanding of simple temporal relations like "before" and "after". The dataset is intended for evaluation only, not for training.

Contents:

1. The dataset has synthetic videos, each showing a pair of shapes that appear gradually. For example, the video for the caption "a red circle appears after a yellow circle" first shows a yellow circle appear and then a red circle appear. The model has to pick the right caption over the distractor caption "a yellow circle appears after a red circle". The distractor has the same set of words in a different order, a design motivated by the Winograd schema.

2. The dataset also has a control set in which each video shows only a single event, e.g., "a red circle appears". This control task checks that the synthetic videos are not out-of-distribution for a given video model.

A time-aware model should perform perfectly on both sets. A space-aware model that is not time-aware should perform poorly on the temporal task while performing perfectly on the control task.
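The caption-versus-distractor comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' evaluation code: the function names `make_distractor` and `binary_choice_accuracy`, the caption pattern, and the `score(video, caption)` callable (standing in for any video-language model's similarity function) are all assumptions made for this example.

```python
import re
from typing import Any, Callable, Sequence, Tuple

# Assumed caption format: "<event A> appears before|after <event B>".
_CAPTION = re.compile(r"^(?P<first>.+?) appears (?P<rel>before|after) (?P<second>.+)$")


def make_distractor(caption: str) -> str:
    """Swap the two shapes but keep the relation word.

    The result uses exactly the same words as the input, in a different order.
    """
    m = _CAPTION.match(caption)
    if m is None:
        raise ValueError(f"unexpected caption format: {caption!r}")
    return f"{m['second']} appears {m['rel']} {m['first']}"


def binary_choice_accuracy(
    examples: Sequence[Tuple[Any, str]],
    score: Callable[[Any, str], float],
) -> float:
    """Fraction of videos for which the true caption outscores its distractor.

    `score` is a hypothetical stand-in for a model's video-text similarity.
    Ties are counted as incorrect in this sketch.
    """
    if not examples:
        raise ValueError("no examples given")
    correct = 0
    for video, caption in examples:
        if score(video, caption) > score(video, make_distractor(caption)):
            correct += 1
    return correct / len(examples)
```

Under this sketch, a scorer that ignores word order gives the caption and its distractor the same score, so it cannot win the comparison. That is the behaviour the temporal task is meant to expose.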

License


  • Unknown
