14 Mar 2024 • Yu-Chu Yu, Chi-Pin Huang, Jr-Jen Chen, Kai-Po Chang, Yung-Hsuan Lai, Fu-En Yang, Yu-Chiang Frank Wang
Large-scale vision-language models (VLMs) have shown strong zero-shot generalization to unseen-domain data.