Multilingual Cross-Modal Retrieval
2 papers with code • 0 benchmarks • 0 datasets
Multilingual cross-modal retrieval comprises image-text retrieval tasks in which the text queries or captions are written in different languages.
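As a rough illustration of how such a task is commonly scored, the sketch below ranks images by cosine similarity to a caption embedding and reports Recall@K separately per language. The embeddings, the language codes, and the helper names (`cosine`, `rank_images`, `recall_at_k`) are all made up for this example; a real system would obtain the vectors from a multilingual text encoder and an image encoder, and the listed papers may evaluate differently.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_images(text_vec, image_vecs):
    """Return image indices sorted from most to least similar to the caption."""
    scores = [cosine(text_vec, img) for img in image_vecs]
    return sorted(range(len(image_vecs)), key=lambda i: scores[i], reverse=True)

def recall_at_k(text_vecs, image_vecs, gold, k):
    """Fraction of captions whose gold image appears in the top-k ranking."""
    hits = sum(
        1 for vec, target in zip(text_vecs, gold)
        if target in rank_images(vec, image_vecs)[:k]
    )
    return hits / len(text_vecs)

# Toy data (hypothetical): three images and one caption per image per language.
image_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
captions = {
    "en": [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.9]],
    "de": [[0.7, 0.3, 0.0], [0.6, 0.5, 0.0], [0.1, 0.1, 0.8]],
}
gold = [0, 1, 2]  # caption i describes image i

# Text-to-image Recall@1, reported per language.
per_language = {
    lang: recall_at_k(vecs, image_vecs, gold, k=1)
    for lang, vecs in captions.items()
}
print(per_language)
```

Reporting the metric per language rather than pooled makes it visible when a model retrieves well for one language and poorly for another, as with the second German caption in this toy data.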
Most implemented papers
mCLIP: Multilingual CLIP via Cross-lingual Transfer
To enhance the token- and sentence-level multilingual representation of the multilingual text encoder (MTE), we propose to train it jointly with machine translation and contrastive learning before the triangle cross-modal knowledge distillation (TriKD), which provides a better initialization.
PaLI-3 Vision Language Models: Smaller, Faster, Stronger
This paper presents PaLI-3, a smaller, faster, and stronger vision language model (VLM) that compares favorably to similar models that are 10x larger.