no code implementations • 31 Oct 2023 • Yingshu Li, Yunyi Liu, Zhanyu Wang, Xinyu Liang, Lei Wang, Lingqiao Liu, Leyang Cui, Zhaopeng Tu, Longyue Wang, Luping Zhou
This work evaluates GPT-4V's multimodal capability for medical image analysis, focusing on three representative tasks: radiology report generation, medical visual question answering, and medical visual grounding.
no code implementations • 14 Sep 2023 • Yunyi Liu, Craig Jin, David Gunawan
Controlling the variations of sound effects using neural audio synthesis models has been a difficult task.
no code implementations • 4 Apr 2023 • Yunyi Liu, Zhanyu Wang, Dong Xu, Luping Zhou
To bridge this gap, we propose a new Transformer-based framework for medical VQA, named Q2ATransformer, which integrates the advantages of both the classification and the generation approaches and provides a unified treatment of closed-ended and open-ended questions.