Search Results for author: Zihan Song

Found 3 papers, 1 paper with code

Grounding-Prompter: Prompting LLM with Multimodal Information for Temporal Sentence Grounding in Long Videos

no code implementations · 28 Dec 2023 · Houlun Chen, Xin Wang, Hong Chen, Zihan Song, Jia Jia, Wenwu Zhu

To tackle these challenges, in this work we propose a Grounding-Prompter method, which is capable of conducting TSG in long videos through prompting LLM with multimodal information.

Tasks: Denoising · In-Context Learning · +3

LLM4VG: Large Language Models Evaluation for Video Grounding

no code implementations · 21 Dec 2023 · Wei Feng, Xin Wang, Hong Chen, Zeyang Zhang, Zihan Song, Yuwei Zhou, Wenwu Zhu

Recently, researchers have attempted to investigate the capability of LLMs to handle videos and have proposed several video LLM models.

Tasks: Image Captioning · Video Grounding · +1

VTimeLLM: Empower LLM to Grasp Video Moments

1 code implementation · 30 Nov 2023 · Bin Huang, Xin Wang, Hong Chen, Zihan Song, Wenwu Zhu

Large language models (LLMs) have shown remarkable text understanding capabilities, which have been extended as Video LLMs to handle video data for comprehending visual details.

Tasks: Dense Video Captioning · Video-based Generative Performance Benchmarking (Consistency) · +5
