2 code implementations • 9 Apr 2024 • Elizaveta Goncharova, Anton Razzhigaev, Matvey Mikhalchuk, Maxim Kurkin, Irina Abdullaeva, Matvey Skripkin, Ivan Oseledets, Denis Dimitrov, Andrey Kuznetsov
We propose an \textit{OmniFusion} model based on a pretrained LLM and adapters for visual modality.
Ranked #40 on Visual Question Answering on MM-Vet