Search Results for author: Qirui Jiao

Found 1 paper, 0 papers with code

Enhancing Multimodal Large Language Models with Vision Detection Models: An Empirical Study

no code implementations • 31 Jan 2024 • Qirui Jiao, Daoyuan Chen, Yilun Huang, Yaliang Li, Ying Shen

Despite the impressive capabilities of Multimodal Large Language Models (MLLMs) in integrating text and image modalities, challenges remain in accurately interpreting detailed visual elements.

Ranked #41 on Visual Question Answering on MM-Vet

Tasks: Hallucination, Object Detection (+3 more)
