LREC 2022 • Abhidip Bhattacharyya, Cecilia Mauceri, Martha Palmer, Christoffer Heckman
As vision processing and natural language processing continue to advance, there is increasing interest in multimodal applications, such as image retrieval, caption generation, and human-robot interaction.