Search Results for author: Mingyu Cui

Found 18 papers, 2 papers with code

Cross-Speaker Encoding Network for Multi-Talker Speech Recognition

no code implementations8 Jan 2024 Jiawen Kang, Lingwei Meng, Mingyu Cui, Haohan Guo, Xixin Wu, Xunying Liu, Helen Meng

To the best of our knowledge, this work represents an early effort to integrate SIMO and SISO for multi-talker speech recognition.

speech-recognition Speech Recognition

Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition

no code implementations6 Jul 2023 Guinan Li, Jiajun Deng, Mengzhe Geng, Zengrui Jin, Tianzi Wang, Shujie Hu, Mingyu Cui, Helen Meng, Xunying Liu

Accurate recognition of cocktail party speech containing overlapping speakers, noise and reverberation remains a highly challenging task to date.

Speech Dereverberation Speech Separation

Unified Modeling of Multi-Talker Overlapped Speech Recognition and Diarization with a Sidecar Separator

no code implementations25 May 2023 Lingwei Meng, Jiawen Kang, Mingyu Cui, Haibin Wu, Xixin Wu, Helen Meng

Extending on this, we incorporate a diarization branch into the Sidecar, allowing for unified modeling of both ASR and diarization with a negligible overhead of only 768 parameters.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Use of Speech Impairment Severity for Dysarthric Speech Recognition

no code implementations18 May 2023 Mengzhe Geng, Zengrui Jin, Tianzi Wang, Shujie Hu, Jiajun Deng, Mingyu Cui, Guinan Li, Jianwei Yu, Xurong Xie, Xunying Liu

A key challenge in dysarthric speech recognition is the speaker-level diversity attributed to both speaker-identity associated factors such as gender, and speech impairment severity.

severity prediction speech-recognition +1

Exploring Self-supervised Pre-trained ASR Models For Dysarthric and Elderly Speech Recognition

no code implementations28 Feb 2023 Shujie Hu, Xurong Xie, Zengrui Jin, Mengzhe Geng, Yi Wang, Mingyu Cui, Jiajun Deng, Xunying Liu, Helen Meng

Experiments conducted on the UASpeech dysarthric and DementiaBank Pitt elderly speech corpora suggest TDNN and Conformer ASR systems integrated domain adapted wav2vec2. 0 models consistently outperform the standalone wav2vec2. 0 models by statistically significant WER reductions of 8. 22% and 3. 43% absolute (26. 71% and 15. 88% relative) on the two tasks respectively.

speech-recognition Speech Recognition

A Sidecar Separator Can Convert a Single-Talker Speech Recognition System to a Multi-Talker One

no code implementations20 Feb 2023 Lingwei Meng, Jiawen Kang, Mingyu Cui, Yuejiao Wang, Xixin Wu, Helen Meng

Although automatic speech recognition (ASR) can perform well in common non-overlapping environments, sustaining performance in multi-talker overlapping speech recognition remains challenging.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems

1 code implementation15 Feb 2023 Jiajun Deng, Xurong Xie, Tianzi Wang, Mingyu Cui, Boyang Xue, Zengrui Jin, Guinan Li, Shujie Hu, Xunying Liu

Practical application of unsupervised model-based speaker adaptation techniques to data intensive end-to-end ASR systems is hindered by the scarcity of speaker-level data and performance sensitivity to transcription errors.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Exploiting Cross-domain And Cross-Lingual Ultrasound Tongue Imaging Features For Elderly And Dysarthric Speech Recognition

no code implementations15 Jun 2022 Shujie Hu, Xurong Xie, Mengzhe Geng, Mingyu Cui, Jiajun Deng, Guinan Li, Tianzi Wang, Xunying Liu, Helen Meng

Articulatory features are inherently invariant to acoustic signal distortion and have been successfully incorporated into automatic speech recognition (ASR) systems designed for normal speech.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Exploiting Cross Domain Acoustic-to-articulatory Inverted Features For Disordered Speech Recognition

no code implementations19 Mar 2022 Shujie Hu, Shansong Liu, Xurong Xie, Mengzhe Geng, Tianzi Wang, Shoukang Hu, Mingyu Cui, Xunying Liu, Helen Meng

Articulatory features are inherently invariant to acoustic signal distortion and have been successfully incorporated into automatic speech recognition (ASR) systems for normal speech.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Recent Progress in the CUHK Dysarthric Speech Recognition System

no code implementations15 Jan 2022 Shansong Liu, Mengzhe Geng, Shoukang Hu, Xurong Xie, Mingyu Cui, Jianwei Yu, Xunying Liu, Helen Meng

Despite the rapid progress of automatic speech recognition (ASR) technologies in the past few decades, recognition of disordered speech remains a highly challenging task to date.

Audio-Visual Speech Recognition Automatic Speech Recognition +4

TopicRefine: Joint Topic Prediction and Dialogue Response Generation for Multi-turn End-to-End Dialogue System

no code implementations11 Sep 2021 Hongru Wang, Mingyu Cui, Zimo Zhou, Gabriel Pui Cheong Fung, Kam-Fai Wong

A multi-turn dialogue always follows a specific topic thread, and topic shift at the discourse level occurs naturally as the conversation progresses, necessitating the model's ability to capture different topics and generate topic-aware responses.

Response Generation

Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks

no code implementations17 Jul 2020 Shoukang Hu, Xurong Xie, Shansong Liu, Mingyu Cui, Mengzhe Geng, Xunying Liu, Helen Meng

Deep neural networks (DNNs) based automatic speech recognition (ASR) systems are often designed using expert knowledge and empirical evaluation.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Cannot find the paper you are looking for? You can Submit a new open access paper.