1 code implementation • 24 Feb 2021 • Jesús Andrés Portillo-Quintero, José Carlos Ortiz-Bayliss, Hugo Terashima-Marín
In this work, we explore the application of the language-image model CLIP to obtain video representations without the need for user-provided annotations.
Ranked #21 on Video Retrieval on MSVD