Search Results for author: Disong Wang

Found 8 papers, 2 papers with code

UNIT-DSR: Dysarthric Speech Reconstruction System Using Speech Unit Normalization

no code implementations • 26 Jan 2024 • Yuejiao Wang, Xixin Wu, Disong Wang, Lingwei Meng, Helen Meng

Dysarthric speech reconstruction (DSR) systems aim to automatically convert dysarthric speech into normal-sounding speech.

Domain Adaptation Generative Adversarial Network +1

Paper
Add Code

Disentangled Speech Representation Learning for One-Shot Cross-lingual Voice Conversion Using $β$-VAE

no code implementations • 25 Oct 2022 • Hui Lu, Disong Wang, Xixin Wu, Zhiyong Wu, Xunying Liu, Helen Meng

We propose an unsupervised learning method to disentangle speech into content representation and speaker identity representation.

Disentanglement Voice Conversion

Paper
Add Code

VCVTS: Multi-speaker Video-to-Speech synthesis via cross-modal knowledge transfer from voice conversion

no code implementations • 18 Feb 2022 • Disong Wang, Shan Yang, Dan Su, Xunying Liu, Dong Yu, Helen Meng

Though significant progress has been made for speaker-dependent Video-to-Speech (VTS) synthesis, little attention is devoted to multi-speaker VTS that can map silent video to speech, while allowing flexible control of speaker identity, all in a single system.

Quantization Speech Synthesis +2

Paper
Add Code

Speaker Identity Preservation in Dysarthric Speech Reconstruction by Adversarial Speaker Adaptation

no code implementations • 18 Feb 2022 • Disong Wang, Songxiang Liu, Xixin Wu, Hui Lu, Lifa Sun, Xunying Liu, Helen Meng

The primary task of ASA fine-tunes the SE with the speech of the target dysarthric speaker to effectively capture identity-related information, and the secondary task applies adversarial training to avoid the incorporation of abnormal speaking patterns into the reconstructed speech, by regularizing the distribution of reconstructed speech to be close to that of reference speech with high quality.

Multi-Task Learning Speaker Verification

Paper
Add Code

VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion

1 code implementation • 18 Jun 2021 • Disong Wang, Liqun Deng, Yu Ting Yeung, Xiao Chen, Xunying Liu, Helen Meng

One-shot voice conversion (VC), which performs conversion across arbitrary speakers with only a single target-speaker utterance for reference, can be effectively achieved by speech representation disentanglement.

Disentanglement Quantization +1

318

Paper
Code

Unsupervised Domain Adaptation for Dysarthric Speech Detection via Domain Adversarial Training and Mutual Information Minimization

no code implementations • 18 Jun 2021 • Disong Wang, Liqun Deng, Yu Ting Yeung, Xiao Chen, Xunying Liu, Helen Meng

Such systems are particularly susceptible to domain mismatch where the training and testing data come from the source and target domains respectively, but the two domains may differ in terms of speech stimuli, disease etiology, etc.

Multi-Task Learning Unsupervised Domain Adaptation

Paper
Add Code

Learning Explicit Prosody Models and Deep Speaker Embeddings for Atypical Voice Conversion

no code implementations • 3 Nov 2020 • Disong Wang, Songxiang Liu, Lifa Sun, Xixin Wu, Xunying Liu, Helen Meng

Third, a conversion model takes phoneme embeddings and typical prosody features as inputs to generate the converted speech, conditioned on the target DSE that is learned via speaker encoder or speaker adaptation.

speech-recognition Speech Recognition +1

Paper
Add Code

Any-to-Many Voice Conversion with Location-Relative Sequence-to-Sequence Modeling

1 code implementation • 6 Sep 2020 • Songxiang Liu, Yuewen Cao, Disong Wang, Xixin Wu, Xunying Liu, Helen Meng

During the training stage, an encoder-decoder-based hybrid connectionist-temporal-classification-attention (CTC-attention) phoneme recognizer is trained, whose encoder has a bottle-neck layer.

feature selection speech-recognition +2

312

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.