6 Sep 2023 • Aobo Xia, Shuyu Lei, Yushu Yang, Xiang Guo, Hua Chai
This paper explores instruction fine-tuning for speech-to-semantic tasks by introducing a unified end-to-end (E2E) framework that generates target text from audio input, conditioned on a task-related prompt.
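
To make the idea of prompt-conditioned generation concrete, the following is a minimal sketch of how training examples for such a unified framework might be assembled. It is not the authors' implementation: the prompt strings, the `TrainingExample` class, the `make_example` helper, and the toy feature vector are all hypothetical and chosen only for illustration. The point it shows is that one audio clip can be paired with different task prompts, each with its own target text.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical task prompts, made up for illustration only;
# the prompts actually used in the paper are not given here.
TASK_PROMPTS: Dict[str, str] = {
    "transcription": "Transcribe the audio.",
    "intent": "What is the speaker's intent?",
}


@dataclass(frozen=True)
class TrainingExample:
    audio_features: List[float]  # placeholder for real acoustic features
    prompt: str                  # task-related instruction
    target_text: str             # text the model should generate


def make_example(task: str, audio_features: List[float], target_text: str) -> TrainingExample:
    """Pair one audio clip with the prompt for `task` and its target text."""
    if task not in TASK_PROMPTS:
        raise ValueError(f"unknown task: {task!r}")
    return TrainingExample(list(audio_features), TASK_PROMPTS[task], target_text)


# The same clip yields a different target depending on the prompt.
clip = [0.1, -0.2, 0.3]
examples = [
    make_example("transcription", clip, "play some jazz"),
    make_example("intent", clip, "PlayMusic"),
]
for ex in examples:
    print(f"{ex.prompt!r} -> {ex.target_text!r}")
```

In a setup like this, a single model would see every task in the same (audio, prompt, target text) form, so adding a task would only mean adding a prompt and labeled examples rather than a new output head.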