no code implementations • 21 Nov 2023 • Minghe Gao, Juncheng Li, Hao Fei, Liang Pang, Wei Ji, Guoming Wang, Wenqiao Zhang, Siliang Tang, Yueting Zhuang
Visual programming, a modular and generalizable paradigm, integrates different modules and Python operators to solve various vision-language tasks.
1 code implementation • 4 Oct 2023 • Dong Chen, Kaihang Pan, Guoming Wang, Yueting Zhuang, Siliang Tang
To learn a more compact latent space for the vision anomaly detector, CMLE learns a correlation structure matrix from the language modality, and then the latent space of vision modality will be learned with the guidance of the matrix.