Search Results for author: Xurong Xie

Found 24 papers, 3 papers with code

Towards Automatic Data Augmentation for Disordered Speech Recognition

no code implementations • 14 Dec 2023 • Zengrui Jin, Xurong Xie, Tianzi Wang, Mengzhe Geng, Jiajun Deng, Guinan Li, Shujie Hu, Xunying Liu

Automatic recognition of disordered speech remains a highly challenging task to date due to data scarcity.

Data Augmentation Reinforcement Learning (RL) +2

Factorised Speaker-environment Adaptive Training of Conformer Speech Recognition Systems

no code implementations • 26 Jun 2023 • Jiajun Deng, Guinan Li, Xurong Xie, Zengrui Jin, Mingyu Cui, Tianzi Wang, Shujie Hu, Mengzhe Geng, Xunying Liu

Rich sources of variability in natural speech present significant challenges to current data intensive speech recognition technologies.

speech-recognition Speech Recognition +1

Use of Speech Impairment Severity for Dysarthric Speech Recognition

no code implementations • 18 May 2023 • Mengzhe Geng, Zengrui Jin, Tianzi Wang, Shujie Hu, Jiajun Deng, Mingyu Cui, Guinan Li, Jianwei Yu, Xurong Xie, Xunying Liu

A key challenge in dysarthric speech recognition is the speaker-level diversity attributed to both speaker-identity associated factors such as gender, and speech impairment severity.

severity prediction speech-recognition +1

Exploring Self-supervised Pre-trained ASR Models For Dysarthric and Elderly Speech Recognition

no code implementations • 28 Feb 2023 • Shujie Hu, Xurong Xie, Zengrui Jin, Mengzhe Geng, Yi Wang, Mingyu Cui, Jiajun Deng, Xunying Liu, Helen Meng

Experiments conducted on the UASpeech dysarthric and DementiaBank Pitt elderly speech corpora suggest that TDNN and Conformer ASR systems integrating domain-adapted wav2vec2.0 models consistently outperform the standalone wav2vec2.0 models by statistically significant WER reductions of 8.22% and 3.43% absolute (26.71% and 15.88% relative) on the two tasks respectively.

speech-recognition Speech Recognition

Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems

1 code implementation • 15 Feb 2023 • Jiajun Deng, Xurong Xie, Tianzi Wang, Mingyu Cui, Boyang Xue, Zengrui Jin, Guinan Li, Shujie Hu, Xunying Liu

Practical application of unsupervised model-based speaker adaptation techniques to data intensive end-to-end ASR systems is hindered by the scarcity of speaker-level data and performance sensitivity to transcription errors.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Unsupervised Model-based speaker adaptation of end-to-end lattice-free MMI model for speech recognition

no code implementations • 17 Nov 2022 • Xurong Xie, Xunying Liu, Hui Chen, Hongan Wang

Modeling the speaker variability is a key challenge for automatic speech recognition (ASR) systems.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Adversarial Data Augmentation Using VAE-GAN for Disordered Speech Recognition

no code implementations • 3 Nov 2022 • Zengrui Jin, Xurong Xie, Mengzhe Geng, Tianzi Wang, Shujie Hu, Jiajun Deng, Guinan Li, Xunying Liu

After LHUC speaker adaptation, the best system using VAE-GAN based augmentation produced an overall WER of 27.78% on the UASpeech test set of 16 dysarthric speakers, and the lowest published WER of 57.31% on the subset of speakers with "Very Low" intelligibility.

Data Augmentation Generative Adversarial Network +2

Confidence Score Based Conformer Speaker Adaptation for Speech Recognition

no code implementations • 24 Jun 2022 • Jiajun Deng, Xurong Xie, Tianzi Wang, Mingyu Cui, Boyang Xue, Zengrui Jin, Mengzhe Geng, Guinan Li, Xunying Liu, Helen Meng

A key challenge for automatic speech recognition (ASR) systems is to model the speaker level variability.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems

no code implementations • 23 Jun 2022 • Mingyu Cui, Jiajun Deng, Shoukang Hu, Xurong Xie, Tianzi Wang, Shujie Hu, Mengzhe Geng, Boyang Xue, Xunying Liu, Helen Meng

Fundamental modelling differences between hybrid and end-to-end (E2E) automatic speech recognition (ASR) systems create large diversity and complementarity among them.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Exploiting Cross-domain And Cross-Lingual Ultrasound Tongue Imaging Features For Elderly And Dysarthric Speech Recognition

no code implementations • 15 Jun 2022 • Shujie Hu, Xurong Xie, Mengzhe Geng, Mingyu Cui, Jiajun Deng, Guinan Li, Tianzi Wang, Xunying Liu, Helen Meng

Articulatory features are inherently invariant to acoustic signal distortion and have been successfully incorporated into automatic speech recognition (ASR) systems designed for normal speech.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

On-the-Fly Feature Based Rapid Speaker Adaptation for Dysarthric and Elderly Speech Recognition

no code implementations • 28 Mar 2022 • Mengzhe Geng, Xurong Xie, Rongfeng Su, Jianwei Yu, Zengrui Jin, Tianzi Wang, Shujie Hu, Zi Ye, Helen Meng, Xunying Liu

Accurate recognition of dysarthric and elderly speech remains a challenging task to date.

speech-recognition Speech Recognition

Exploiting Cross Domain Acoustic-to-articulatory Inverted Features For Disordered Speech Recognition

no code implementations • 19 Mar 2022 • Shujie Hu, Shansong Liu, Xurong Xie, Mengzhe Geng, Tianzi Wang, Shoukang Hu, Mingyu Cui, Xunying Liu, Helen Meng

Articulatory features are inherently invariant to acoustic signal distortion and have been successfully incorporated into automatic speech recognition (ASR) systems for normal speech.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Speaker Adaptation Using Spectro-Temporal Deep Features for Dysarthric and Elderly Speech Recognition

no code implementations • 21 Feb 2022 • Mengzhe Geng, Xurong Xie, Zi Ye, Tianzi Wang, Guinan Li, Shujie Hu, Xunying Liu, Helen Meng

Motivated by the spectro-temporal level differences between dysarthric, elderly and normal speech that systematically manifest in articulatory imprecision, decreased volume and clarity, slower speaking rates and increased dysfluencies, novel spectrotemporal subspace basis deep embedding features derived using SVD speech spectrum decomposition are proposed in this paper to facilitate auxiliary feature based speaker adaptation of state-of-the-art hybrid DNN/TDNN and end-to-end Conformer speech recognition systems.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Variational Auto-Encoder Based Variability Encoding for Dysarthric Speech Recognition

no code implementations • 24 Jan 2022 • Xurong Xie, Rukiye Ruzi, Xunying Liu, Lan Wang

Dysarthric speech recognition is a challenging task due to acoustic variability and the limited amount of available data.

speech-recognition Speech Recognition

Investigation of Deep Neural Network Acoustic Modelling Approaches for Low Resource Accented Mandarin Speech Recognition

no code implementations • 24 Jan 2022 • Xurong Xie, Xiang Sui, Xunying Liu, Lan Wang

Meanwhile, approaches of multi-accent modelling including multi-style training, multi-accent decision tree state tying, DNN tandem and multi-level adaptive network (MLAN) tandem hidden Markov model (HMM) modelling are combined and compared in this paper.

Acoustic Modelling speech-recognition +1

Recent Progress in the CUHK Dysarthric Speech Recognition System

no code implementations • 15 Jan 2022 • Shansong Liu, Mengzhe Geng, Shoukang Hu, Xurong Xie, Mingyu Cui, Jianwei Yu, Xunying Liu, Helen Meng

Despite the rapid progress of automatic speech recognition (ASR) technologies in the past few decades, recognition of disordered speech remains a highly challenging task to date.

Audio-Visual Speech Recognition Automatic Speech Recognition +4

Spectro-Temporal Deep Features for Disordered Speech Assessment and Recognition

no code implementations • 14 Jan 2022 • Mengzhe Geng, Shansong Liu, Jianwei Yu, Xurong Xie, Shoukang Hu, Zi Ye, Zengrui Jin, Xunying Liu, Helen Meng

Automatic recognition of disordered speech remains a highly challenging task to date.

Data Augmentation speech-recognition +1

Investigation of Data Augmentation Techniques for Disordered Speech Recognition

no code implementations • 14 Jan 2022 • Mengzhe Geng, Xurong Xie, Shansong Liu, Jianwei Yu, Shoukang Hu, Xunying Liu, Helen Meng

This paper investigates a set of data augmentation techniques for disordered speech recognition, including vocal tract length perturbation (VTLP), tempo perturbation and speed perturbation.

Data Augmentation speech-recognition +1

Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks

1 code implementation • 8 Jan 2022 • Shoukang Hu, Xurong Xie, Mingyu Cui, Jiajun Deng, Shansong Liu, Jianwei Yu, Mengzhe Geng, Xunying Liu, Helen Meng

State-of-the-art automatic speech recognition (ASR) system development is data and computation intensive.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

A Multi-level Acoustic Feature Extraction Framework for Transformer Based End-to-End Speech Recognition

no code implementations • 18 Aug 2021 • Jin Li, Rongfeng Su, Xurong Xie, Nan Yan, Lan Wang

The shallow stream is used to acquire traditional shallow features that are beneficial for the classification of phones or words, while the deep stream is used to obtain utterance-level speaker-invariant deep features for improving the feature diversity.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Adversarial Data Augmentation for Disordered Speech Recognition

no code implementations • 2 Aug 2021 • Zengrui Jin, Mengzhe Geng, Xurong Xie, Jianwei Yu, Shansong Liu, Xunying Liu, Helen Meng

Automatic recognition of disordered speech remains a highly challenging task to date.

Data Augmentation speech-recognition +1

Bayesian Learning for Deep Neural Network Adaptation

1 code implementation • 14 Dec 2020 • Xurong Xie, Xunying Liu, Tan Lee, Lan Wang

A key task for speech recognition systems is to reduce the mismatch between training and evaluation data that is often attributable to speaker differences.

speech-recognition Speech Recognition +1

Bayesian Learning of LF-MMI Trained Time Delay Neural Networks for Speech Recognition

no code implementations • 8 Dec 2020 • Shoukang Hu, Xurong Xie, Shansong Liu, Jianwei Yu, Zi Ye, Mengzhe Geng, Xunying Liu, Helen Meng

On a third cross domain adaptation task requiring rapidly porting a 1000 hour LibriSpeech data trained system to a small DementiaBank elderly speech corpus, the proposed Bayesian TDNN LF-MMI systems outperformed the baseline system using direct weight fine-tuning by up to 2.5% absolute WER reduction.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks

no code implementations • 17 Jul 2020 • Shoukang Hu, Xurong Xie, Shansong Liu, Mingyu Cui, Mengzhe Geng, Xunying Liu, Helen Meng

Deep neural network (DNN) based automatic speech recognition (ASR) systems are often designed using expert knowledge and empirical evaluation.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
