1 code implementation • 25 Feb 2024 • Minsu Kim, Jee-weon Jung, Hyeongseop Rha, Soumi Maiti, Siddhant Arora, Xuankai Chang, Shinji Watanabe, Yong Man Ro
We propose a novel Tri-Modal Translation (TMT) model that translates between arbitrary modalities spanning speech, image, and text.