Search Results for author: Jianxin Liang

Found 2 papers, 2 papers with code

HawkEye: Training Video-Text LLMs for Grounding Text in Videos

1 code implementation • 15 Mar 2024 • Yueqian Wang, Xiaojun Meng, Jianxin Liang, Yuxuan Wang, Qun Liu, Dongyan Zhao

Video-text Large Language Models (video-text LLMs) have shown remarkable performance in answering questions and holding conversations on simple videos.

Ranked #4 on Video Question Answering on MVBench

Video Grounding • Video Question Answering

LSTP: Language-guided Spatial-Temporal Prompt Learning for Long-form Video-Text Understanding

1 code implementation • 25 Feb 2024 • Yuxuan Wang, Yueqian Wang, Pengfei Wu, Jianxin Liang, Dongyan Zhao, Zilong Zheng

Despite progress in video-language modeling, the computational challenge of interpreting long-form videos in response to task-specific linguistic queries persists, largely due to the complexity of high-dimensional video data and the misalignment between language and visual cues over space and time.

Ranked #8 on Video Question Answering on NExT-QA

Computational Efficiency • Language Modelling • +3 more
