Search Results for author: Fanyi Pu

Found 2 papers, 2 papers with code

OtterHD: A High-Resolution Multi-modality Model

1 code implementation • 7 Nov 2023 • Bo Li, Peiyuan Zhang, Jingkang Yang, Yuanhan Zhang, Fanyi Pu, Ziwei Liu

In this paper, we present OtterHD-8B, an innovative multimodal model evolved from Fuyu-8B, specifically engineered to interpret high-resolution visual inputs with granular precision.

Ranked #85 on Visual Question Answering on MM-Vet

Visual Question Answering

3,449

Paper
Code

MIMIC-IT: Multi-Modal In-Context Instruction Tuning

2 code implementations • 8 Jun 2023 • Bo Li, Yuanhan Zhang, Liangyu Chen, Jinghao Wang, Fanyi Pu, Jingkang Yang, Chunyuan Li, Ziwei Liu

We release the MIMIC-IT dataset, instruction-response collection pipeline, benchmarks, and the Otter model.

Ranked #87 on Visual Question Answering on MM-Vet

In-Context Learning Visual Question Answering

3,449

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.