Search Results for author: Huixuan Zhang

Quantity Matters: Towards Assessing and Mitigating Number Hallucination in Large Vision-Language Models

Large-scale vision-language models have demonstrated impressive skill in handling tasks that involve both vision and language.

News image captioning requires a model to generate an informative, entity-rich caption from a news image and its associated news article.

We create a multimodal detection dataset from Weibo (a Chinese social media platform) and carry out several studies on it.
