Search Results for author: Yuzi Yan

Found 6 papers, 1 paper with code

Distributed Policy Gradient for Linear Quadratic Networked Control with Limited Communication Range

no code implementations • 5 Mar 2024 • Yuzi Yan, Yuan Shen

This paper proposes a scalable distributed policy gradient method and proves its convergence to a near-optimal solution in multi-agent linear quadratic networked systems.

Mixed-Phoneme BERT: Improving BERT with Mixed Phoneme and Sup-Phoneme Representations for Text to Speech

no code implementations • 31 Mar 2022 • Guangyan Zhang, Kaitao Song, Xu Tan, Daxin Tan, Yuzi Yan, Yanqing Liu, Gang Wang, Wei Zhou, Tao Qin, Tan Lee, Sheng Zhao

However, the works apply pre-training with character-based units to enhance the TTS phoneme encoder, which is inconsistent with the TTS fine-tuning that takes phonemes as input.

Full Attention Bidirectional Deep Learning Structure for Single Channel Speech Enhancement

no code implementations • 27 Aug 2021 • Yuzi Yan, Wei-Qiang Zhang, Michael T. Johnson

As the cornerstone of other important technologies, such as speech recognition and speech synthesis, speech enhancement is a critical area in audio signal processing.

Audio Signal Processing • Speech Enhancement • +3

AdaSpeech 3: Adaptive Text to Speech for Spontaneous Style

no code implementations • 6 Jul 2021 • Yuzi Yan, Xu Tan, Bohan Li, Guangyan Zhang, Tao Qin, Sheng Zhao, Yuan Shen, Wei-Qiang Zhang, Tie-Yan Liu

While recent text to speech (TTS) models perform very well in synthesizing reading-style (e.g., audiobook) speech, it is still challenging to synthesize spontaneous-style speech (e.g., podcast or conversation), mainly for two reasons: 1) the lack of training data for spontaneous speech; 2) the difficulty of modeling the filled pauses (um and uh) and diverse rhythms in spontaneous speech.

AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data

1 code implementation • 20 Apr 2021 • Yuzi Yan, Xu Tan, Bohan Li, Tao Qin, Sheng Zhao, Yuan Shen, Tie-Yan Liu

In adaptation, we use untranscribed speech data for speech reconstruction and only fine-tune the TTS decoder.
