no code implementations • 16 Jul 2022 • Madeline C. Schiappa, Yogesh S. Rawat
In this work, we focus on generating graphical representations of noisy instructional videos for video understanding.
1 code implementation • 5 Jul 2022 • Madeline C. Schiappa, Shruti Vyas, Hamid Palangi, Yogesh S. Rawat, Vibhav Vineet
Joint visual and language modeling on large-scale datasets has recently shown good progress on multi-modal tasks compared with single-modal learning.
1 code implementation • 18 Jun 2022 • Madeline C. Schiappa, Yogesh S. Rawat, Mubarak Shah
In this survey, we review existing approaches to self-supervised learning, focusing on the video domain.