Search Results for author: Trevor Ashby

Found 1 paper, 0 papers with code

Vision-Flan: Scaling Human-Labeled Tasks in Visual Instruction Tuning

no code implementations | 18 Feb 2024 | Zhiyang Xu, Chao Feng, Rulin Shao, Trevor Ashby, Ying Shen, Di Jin, Yu Cheng, Qifan Wang, Lifu Huang

Despite vision-language models' (VLMs) remarkable capabilities as versatile visual assistants, two substantial challenges persist within the existing VLM frameworks: (1) lacking task diversity in pretraining and visual instruction tuning, and (2) annotation error and bias in GPT-4 synthesized instruction tuning data.

Hallucination, Visual Question Answering
