no code implementations • 8 Jan 2024 • Jiawen Kang, Lingwei Meng, Mingyu Cui, Haohan Guo, Xixin Wu, Xunying Liu, Helen Meng
To the best of our knowledge, this work represents an early effort to integrate SIMO and SISO for multi-talker speech recognition.
no code implementations • 6 Jul 2023 • Guinan Li, Jiajun Deng, Mengzhe Geng, Zengrui Jin, Tianzi Wang, Shujie Hu, Mingyu Cui, Helen Meng, Xunying Liu
Accurate recognition of cocktail party speech containing overlapping speakers, noise and reverberation remains a highly challenging task to date.
no code implementations • 26 Jun 2023 • Jiajun Deng, Guinan Li, Xurong Xie, Zengrui Jin, Mingyu Cui, Tianzi Wang, Shujie Hu, Mengzhe Geng, Xunying Liu
Rich sources of variability in natural speech present significant challenges to current data intensive speech recognition technologies.
no code implementations • 23 Jun 2023 • Mingyu Cui, Jiawen Kang, Jiajun Deng, Xi Yin, Yutao Xie, Xie Chen, Xunying Liu
Current ASR systems are mainly trained and evaluated at the utterance level.
no code implementations • 25 May 2023 • Lingwei Meng, Jiawen Kang, Mingyu Cui, Haibin Wu, Xixin Wu, Helen Meng
Extending on this, we incorporate a diarization branch into the Sidecar, allowing for unified modeling of both ASR and diarization with a negligible overhead of only 768 parameters.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
no code implementations • 18 May 2023 • Mengzhe Geng, Zengrui Jin, Tianzi Wang, Shujie Hu, Jiajun Deng, Mingyu Cui, Guinan Li, Jianwei Yu, Xurong Xie, Xunying Liu
A key challenge in dysarthric speech recognition is the speaker-level diversity attributed to both speaker-identity associated factors such as gender, and speech impairment severity.
no code implementations • 28 Feb 2023 • Shujie Hu, Xurong Xie, Zengrui Jin, Mengzhe Geng, Yi Wang, Mingyu Cui, Jiajun Deng, Xunying Liu, Helen Meng
Experiments conducted on the UASpeech dysarthric and DementiaBank Pitt elderly speech corpora suggest TDNN and Conformer ASR systems integrated domain adapted wav2vec2. 0 models consistently outperform the standalone wav2vec2. 0 models by statistically significant WER reductions of 8. 22% and 3. 43% absolute (26. 71% and 15. 88% relative) on the two tasks respectively.
no code implementations • 20 Feb 2023 • Lingwei Meng, Jiawen Kang, Mingyu Cui, Yuejiao Wang, Xixin Wu, Helen Meng
Although automatic speech recognition (ASR) can perform well in common non-overlapping environments, sustaining performance in multi-talker overlapping speech recognition remains challenging.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
1 code implementation • 15 Feb 2023 • Jiajun Deng, Xurong Xie, Tianzi Wang, Mingyu Cui, Boyang Xue, Zengrui Jin, Guinan Li, Shujie Hu, Xunying Liu
Practical application of unsupervised model-based speaker adaptation techniques to data intensive end-to-end ASR systems is hindered by the scarcity of speaker-level data and performance sensitivity to transcription errors.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
no code implementations • 24 Jun 2022 • Jiajun Deng, Xurong Xie, Tianzi Wang, Mingyu Cui, Boyang Xue, Zengrui Jin, Mengzhe Geng, Guinan Li, Xunying Liu, Helen Meng
A key challenge for automatic speech recognition (ASR) systems is to model the speaker level variability.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
no code implementations • 23 Jun 2022 • Tianzi Wang, Jiajun Deng, Mengzhe Geng, Zi Ye, Shoukang Hu, Yi Wang, Mingyu Cui, Zengrui Jin, Xunying Liu, Helen Meng
Early diagnosis of Alzheimer's disease (AD) is crucial in facilitating preventive care to delay further progression.
no code implementations • 23 Jun 2022 • Mingyu Cui, Jiajun Deng, Shoukang Hu, Xurong Xie, Tianzi Wang, Shujie Hu, Mengzhe Geng, Boyang Xue, Xunying Liu, Helen Meng
Fundamental modelling differences between hybrid and end-to-end (E2E) automatic speech recognition (ASR) systems create large diversity and complementarity among them.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
no code implementations • 15 Jun 2022 • Shujie Hu, Xurong Xie, Mengzhe Geng, Mingyu Cui, Jiajun Deng, Guinan Li, Tianzi Wang, Xunying Liu, Helen Meng
Articulatory features are inherently invariant to acoustic signal distortion and have been successfully incorporated into automatic speech recognition (ASR) systems designed for normal speech.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
no code implementations • 19 Mar 2022 • Shujie Hu, Shansong Liu, Xurong Xie, Mengzhe Geng, Tianzi Wang, Shoukang Hu, Mingyu Cui, Xunying Liu, Helen Meng
Articulatory features are inherently invariant to acoustic signal distortion and have been successfully incorporated into automatic speech recognition (ASR) systems for normal speech.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
no code implementations • 15 Jan 2022 • Shansong Liu, Mengzhe Geng, Shoukang Hu, Xurong Xie, Mingyu Cui, Jianwei Yu, Xunying Liu, Helen Meng
Despite the rapid progress of automatic speech recognition (ASR) technologies in the past few decades, recognition of disordered speech remains a highly challenging task to date.
Audio-Visual Speech Recognition Automatic Speech Recognition +4
1 code implementation • 8 Jan 2022 • Shoukang Hu, Xurong Xie, Mingyu Cui, Jiajun Deng, Shansong Liu, Jianwei Yu, Mengzhe Geng, Xunying Liu, Helen Meng
State-of-the-art automatic speech recognition (ASR) system development is data and computation intensive.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
no code implementations • 11 Sep 2021 • Hongru Wang, Mingyu Cui, Zimo Zhou, Gabriel Pui Cheong Fung, Kam-Fai Wong
A multi-turn dialogue always follows a specific topic thread, and topic shift at the discourse level occurs naturally as the conversation progresses, necessitating the model's ability to capture different topics and generate topic-aware responses.
no code implementations • 17 Jul 2020 • Shoukang Hu, Xurong Xie, Shansong Liu, Mingyu Cui, Mengzhe Geng, Xunying Liu, Helen Meng
Deep neural networks (DNNs) based automatic speech recognition (ASR) systems are often designed using expert knowledge and empirical evaluation.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2