no code implementations • 21 Mar 2024 • Ahmad Mahmood, Ashmal Vayani, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan
This paper introduces a Video Understanding and Reasoning Framework (VURF) based on the reasoning power of LLMs.
no code implementations • 23 Feb 2023 • Muzammal Naseer, Ahmad Mahmood, Salman Khan, Fahad Khan
Our temporal prompts are the result of a learnable transformation that allows optimizing for temporal gradients during an adversarial attack to fool the motion dynamics.