1 code implementation • 4 Apr 2024 • Kirolos Ataallah, Xiaoqian Shen, Eslam Abdelrahman, Essam Sleiman, Deyao Zhu, Jian Ding, Mohamed Elhoseiny
This paper introduces MiniGPT4-Video, a multimodal Large Language Model (LLM) designed specifically for video understanding.
Ranked #3 on Zero-Shot Video Question Answer on TVQA