1 code implementation • 29 Jan 2024 • Qingpei Guo, Furong Xu, Hanxiao Zhang, Wang Ren, Ziping Ma, Lin Ju, Jian Wang, Jingdong Chen, Ming Yang
Vision-language foundation models like CLIP have revolutionized the field of artificial intelligence.
Ranked #1 on Zero-shot Image Retrieval on Flickr30k-CN (using extra training data)
Zero-Shot Cross-Modal Retrieval Zero-shot Image Retrieval +3
no code implementations • 4 Jan 2024 • Ziping Ma, Furong Xu, Jian Liu, Ming Yang, Qingpei Guo
To achieve multimodal alignment from both global and local perspectives, this paper proposes Symmetrizing Contrastive Captioners (SyCoCa), which introduces bidirectional interactions on images and texts across the global and local representation levels.
1 code implementation • CVPR 2022 • Xiyao Liu, Ziping Ma, Junxing Ma, Jian Zhang, Gerald Schaefer, Hui Fang
Conventional steganography approaches embed a secret message into a carrier for concealed communication but are prone to attack by recent advanced steganalysis tools.