Search Results for author: Yufei Zhan

Found 3 papers, 2 papers with code

Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring

1 code implementation • 14 Mar 2024 • Yufei Zhan, Yousong Zhu, Hongyin Zhao, Fan Yang, Ming Tang, Jinqiao Wang

Large Vision Language Models have achieved fine-grained object perception, but limited image resolution remains a significant obstacle to surpassing the performance of task-specific experts in complex and dense scenarios.

Object, Object Counting, +3

Mitigating Hallucination in Visual Language Models with Visual Supervision

no code implementations • 27 Nov 2023 • Zhiyang Chen, Yousong Zhu, Yufei Zhan, Zhaowen Li, Chaoyang Zhao, Jinqiao Wang, Ming Tang

Large vision-language models (LVLMs) frequently suffer from hallucination, occasionally generating responses that apparently contradict the image content.

Hallucination

Griffon: Spelling out All Object Locations at Any Granularity with Large Language Models

1 code implementation • 24 Nov 2023 • Yufei Zhan, Yousong Zhu, Zhiyang Chen, Fan Yang, Ming Tang, Jinqiao Wang

More importantly, we present Griffon, a purely LVLM-based baseline, which does not require the introduction of any special tokens, expert models, or additional detection modules.

Referring Expression, Referring Expression Comprehension
