Search Results for author: Minh Quan Do

Found 2 papers, 0 papers with code

Vamos: Versatile Action Models for Video Understanding

no code implementations · 22 Nov 2023 · Shijie Wang, Qi Zhao, Minh Quan Do, Nakul Agarwal, Kwonjoon Lee, Chen Sun

What makes good video representations for video understanding, such as anticipating future activities, or answering video-conditioned questions?

Tasks: Language Modelling, Large Language Model, +2

AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos?

no code implementations · 31 Jul 2023 · Qi Zhao, Shijie Wang, Ce Zhang, Changcheng Fu, Minh Quan Do, Nakul Agarwal, Kwonjoon Lee, Chen Sun

We propose to formulate the LTA task from two perspectives: a bottom-up approach that predicts the next actions autoregressively by modeling temporal dynamics; and a top-down approach that infers the goal of the actor and plans the needed procedure to accomplish the goal.
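The two perspectives above can be illustrated with a minimal sketch. This is not the AntGPT implementation; the function and model names (`transition`, `infer_goal`, `plan`) are hypothetical stand-ins for learned components.

```python
# Hedged sketch of the two long-term action anticipation (LTA) framings
# described in the abstract. All helpers here are illustrative assumptions,
# not the paper's actual code.

def bottom_up(observed_actions, horizon, transition):
    """Predict future actions autoregressively via temporal dynamics:
    each next action is conditioned on the growing action history."""
    history = list(observed_actions)
    for _ in range(horizon):
        history.append(transition(history))
    return history[len(observed_actions):]

def top_down(observed_actions, infer_goal, plan):
    """Infer the actor's goal from observations, then plan the
    remaining procedure needed to accomplish that goal."""
    goal = infer_goal(observed_actions)
    return plan(goal, observed_actions)

# Toy example: a fixed "make tea" routine stands in for learned models.
routine = ["boil water", "steep tea", "pour tea", "drink tea"]
transition = lambda h: routine[min(len(h), len(routine) - 1)]
future_bu = bottom_up(["boil water"], horizon=2, transition=transition)
# → ["steep tea", "pour tea"]

infer_goal = lambda obs: "make tea"          # goal inference stub
plan = lambda goal, obs: routine[len(obs):]  # procedure planning stub
future_td = top_down(["boil water"], infer_goal, plan)
# → ["steep tea", "pour tea", "drink tea"]
```

The toy `routine` replaces the paper's learned temporal-dynamics and goal/planning models; only the control flow of the two framings is preserved.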

Tasks: Action Anticipation, counterfactual, +1
