1 code implementation • 7 Nov 2023 • Bo Li, Peiyuan Zhang, Jingkang Yang, Yuanhan Zhang, Fanyi Pu, Ziwei Liu
In this paper, we present OtterHD-8B, an innovative multimodal model evolved from Fuyu-8B, specifically engineered to interpret high-resolution visual inputs with granular precision.
Ranked #85 on Visual Question Answering on MM-Vet
2 code implementations • 8 Jun 2023 • Bo Li, Yuanhan Zhang, Liangyu Chen, Jinghao Wang, Fanyi Pu, Jingkang Yang, Chunyuan Li, Ziwei Liu
We release the MIMIC-IT dataset, instruction-response collection pipeline, benchmarks, and the Otter model.
Ranked #87 on Visual Question Answering on MM-Vet