Multilingual Cross-Modal Retrieval

2 papers with code • 0 benchmarks • 0 datasets

Multilingual cross-modal retrieval covers image-text retrieval tasks in which the text queries or captions are written in different languages.
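As a rough illustration of how such a task is commonly scored, the sketch below computes text-to-image Recall@K separately for each language. The embeddings are made-up toy vectors, and the names `cosine`, `recall_at_k`, and `recall_by_lang` are hypothetical helpers chosen for this example; a real system would obtain the vectors from an image encoder and a multilingual text encoder.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recall_at_k(text_embs, image_embs, k=1):
    """Fraction of captions whose paired image (same index) ranks in the top k."""
    hits = 0
    for i, text in enumerate(text_embs):
        scores = [cosine(text, image) for image in image_embs]
        # Rank image indices from most to least similar to this caption.
        ranked = sorted(range(len(image_embs)), key=lambda j: scores[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(text_embs)

# Toy data (assumed, not from any benchmark): 3 images, captions in two languages.
image_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
captions = {
    "en": [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.9]],
    "de": [[0.7, 0.3, 0.0], [0.6, 0.5, 0.0], [0.1, 0.1, 0.8]],
}

# One Recall@1 score per language, so gaps between languages are visible.
recall_by_lang = {
    lang: recall_at_k(embs, image_embs, k=1) for lang, embs in captions.items()
}
print(recall_by_lang)
```

Reporting the metric per language rather than pooled makes it easier to see whether a model retrieves well only for high-resource languages.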

Most implemented papers

mCLIP: Multilingual CLIP via Cross-lingual Transfer

ghchen18/acl23_mclip ACL 2023

To enhance the token- and sentence-level multilingual representation of the multilingual text encoder (MTE), the authors propose training it jointly with machine translation and contrastive learning before the Triangle Cross-modal Knowledge Distillation (TriKD) stage, to provide a better initialization.
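To make the contrastive-learning ingredient concrete, here is a minimal sketch of a generic InfoNCE-style loss over parallel sentence embeddings. It is not the paper's exact objective: the function name `info_nce`, the temperature value, and the toy vectors are all assumptions for illustration, and the machine translation term of the joint training is not modeled here.

```python
import math

def info_nce(src_embs, tgt_embs, temperature=0.1):
    """Mean InfoNCE loss; src_embs[i] and tgt_embs[i] are the positive pair."""
    def normalize(v):
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v]

    src = [normalize(v) for v in src_embs]
    tgt = [normalize(v) for v in tgt_embs]
    total = 0.0
    for i, s in enumerate(src):
        # Scaled cosine similarity of sentence i to every candidate translation.
        logits = [sum(a * b for a, b in zip(s, t)) / temperature for t in tgt]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the true translation as the target class.
        total += log_denom - logits[i]
    return total / len(src)

# Toy sentence embeddings (assumed): two sources and their translations.
src = [[1.0, 0.0], [0.0, 1.0]]
tgt_aligned = [[0.9, 0.1], [0.1, 0.9]]
tgt_shuffled = [[0.1, 0.9], [0.9, 0.1]]

loss_aligned = info_nce(src, tgt_aligned)
loss_shuffled = info_nce(src, tgt_shuffled)
print(loss_aligned, loss_shuffled)
```

The loss is small when each sentence is closest to its own translation and large when pairs are mismatched, which is the pressure that pulls sentence-level representations of different languages together.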

PaLI-3 Vision Language Models: Smaller, Faster, Stronger

kyegomez/PALI3 13 Oct 2023

This paper presents PaLI-3, a smaller, faster, and stronger vision language model (VLM) that compares favorably to similar models that are 10x larger.