1 code implementation • 7 Feb 2024 • Gilles Baechler, Srinivas Sunkara, Maria Wang, Fedir Zubach, Hassan Mansoor, Vincent Etter, Victor Cărbune, Jason Lin, Jindong Chen, Abhanshu Sharma
At the heart of this mixture is a novel screen annotation task in which the model has to identify the type and location of UI elements.
Ranked #3 on Visual Question Answering (VQA) on InfographicVQA (using extra training data)