1 code implementation • 25 Jan 2024 • Sanket Rajan Gupte, Josiah Aklilu, Jeffrey J. Nirschl, Serena Yeung-Levy
Foundation vision or vision-language models are trained on large unlabeled or noisy data and learn robust representations that can achieve impressive zero- or few-shot performance on diverse tasks.