12 Nov 2022 • Kaicheng Yang, Ruxuan Zhang, Hua Xu, Kai Gao
In this paper, a Self-Adjusting Fusion Representation Learning Model (SA-FRLM) is proposed to learn robust crossmodal fusion representations directly from unaligned text and audio sequences.