no code implementations • 9 Mar 2024 • Hexin Liu, Xiangyu Zhang, Leibny Paola Garcia, Andy W. H. Khong, Eng Siong Chng, Shinji Watanabe
Performance evaluation using large language models reveals the advantage of the linguistic hint by achieving 14.1% and 5.5% relative improvement on test sets of the ASRU and SEAME datasets, respectively.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
no code implementations • 16 Feb 2024 • Xiangyu Zhang, Daijiao Liu, Hexin Liu, Qiquan Zhang, Hanyu Meng, Leibny Paola Garcia, Eng Siong Chng, Lina Yao
Recently, Denoising Diffusion Probabilistic Models (DDPMs) have attained leading performance across a diverse range of generative tasks.
1 code implementation • 27 Nov 2023 • Shuyue Stella Li, Beining Xu, Xiangyu Zhang, Hexin Liu, WenHan Chao, Leibny Paola Garcia
There is a positive correlation between PSR scores and ASR performance, suggesting that phonetic information extracted by monolingual SSL models can be used for downstream tasks in cross-lingual settings.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
no code implementations • 29 Sep 2023 • Hexin Liu, Leibny Paola Garcia, Xiangyu Zhang, Andy W. H. Khong, Sanjeev Khudanpur
Languages usually switch within a multilingual speech signal, especially in a bilingual society.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
1 code implementation • 26 Sep 2023 • Ruixing Liang, Xiangyu Zhang, Qiong Li, Lai Wei, Hexin Liu, Avisha Kumar, Kelley M. Kempski Leadingham, Joshua Punnoose, Leibny Paola Garcia, Amir Manbachi
While significant advancements in artificial intelligence (AI) have catalyzed progress across various domains, its full potential in understanding visual perception remains underexplored.
no code implementations • 1 Jun 2023 • Dongji Gao, Matthew Wiesner, Hainan Xu, Leibny Paola Garcia, Daniel Povey, Sanjeev Khudanpur
Imperfectly transcribed speech is a prevalent issue in human-annotated speech corpora, which degrades the performance of ASR models.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
1 code implementation • 30 Nov 2022 • Dongji Gao, Jiatong Shi, Shun-Po Chuang, Leibny Paola Garcia, Hung-Yi Lee, Shinji Watanabe, Sanjeev Khudanpur
This paper describes the ESPnet Unsupervised ASR Open-source Toolkit (EURO), an end-to-end open-source toolkit for unsupervised automatic speech recognition (UASR).
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
1 code implementation • 26 Oct 2022 • Hexin Liu, HaiHua Xu, Leibny Paola Garcia, Andy W. H. Khong, Yi He, Sanjeev Khudanpur
The comparison of the proposed methods indicates that incorporating language information is more effective than disentangling for reducing language confusion in CS speech.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
no code implementations • 21 Oct 2022 • Yu Xuan, Xiangyu Zhang, Shuyue Stella Li, Zihan Shen, Xin Xie, Leibny Paola Garcia, Roberto Togneri
Compared with the state-of-the-art MSF-ANC method, CRLS shows improved performance.
no code implementations • 6 Oct 2022 • Shuyue Stella Li, Xiangyu Zhang, Shu Zhou, Hongchao Shu, Ruixing Liang, Hexin Liu, Leibny Paola Garcia
In this work, we propose a highly Portable Quantum Language Model (PQLM) that can easily transmit information to downstream tasks on classical machines.
no code implementations • 26 Sep 2022 • Xiangyu Zhang, Shuyue Stella Li, Zhanhong He, Roberto Togneri, Leibny Paola Garcia
Lyrics recognition is an important task in music processing.