Search Results for author: Shengkui Zhao

Found 7 papers, 5 papers with code

SPGM: Prioritizing Local Features for Enhanced Speech Separation Performance

1 code implementation · 22 Sep 2023 · Jia Qi Yip, Shengkui Zhao, Yukun Ma, Chongjia Ni, Chong Zhang, Hao Wang, Trung Hieu Nguyen, Kun Zhou, Dianwen Ng, Eng Siong Chng, Bin Ma

Dual-path is a popular architecture for speech separation models (e.g. Sepformer). It splits long sequences into overlapping chunks, so that its intra-blocks model local features within each chunk and its inter-blocks model global relationships across chunks.
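The chunking step described above can be sketched as follows; this is a minimal illustration (function and parameter names are my own, not from the paper), showing how a long sequence becomes a stack of overlapping chunks that intra- and inter-blocks can then attend over.

```python
import numpy as np

def chunk(x, chunk_size, hop):
    """Split a 1-D sequence into overlapping chunks (dual-path style).

    x: (T,) array; chunk_size: frames per chunk; hop: stride between
    chunk starts (hop < chunk_size gives overlap). Zero-pads so the
    last chunk is full. Returns an array of shape (num_chunks, chunk_size).
    """
    T = len(x)
    num_chunks = max(1, int(np.ceil((T - chunk_size) / hop)) + 1)
    padded = np.zeros((num_chunks - 1) * hop + chunk_size)
    padded[:T] = x
    return np.stack([padded[i * hop : i * hop + chunk_size]
                     for i in range(num_chunks)])

x = np.arange(10.0)
chunks = chunk(x, chunk_size=4, hop=2)  # 50% overlap -> shape (4, 4)
# Intra-blocks attend within each row (local features);
# inter-blocks attend across rows at the same within-chunk position.
```

With `hop` equal to half of `chunk_size`, every frame appears in two chunks, which is the usual 50% overlap used by dual-path models.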

Speech Separation

ACA-Net: Towards Lightweight Speaker Verification using Asymmetric Cross Attention

1 code implementation · 20 May 2023 · Jia Qi Yip, Tuan Truong, Dianwen Ng, Chong Zhang, Yukun Ma, Trung Hieu Nguyen, Chongjia Ni, Shengkui Zhao, Eng Siong Chng, Bin Ma

In this paper, we propose ACA-Net, a lightweight, global context-aware speaker embedding extractor for Speaker Verification (SV) that improves upon existing work by using Asymmetric Cross Attention (ACA) to replace temporal pooling.
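The core idea, replacing temporal pooling with cross attention, can be sketched like this. This is a simplified single-head sketch under my own naming, not the ACA-Net implementation: a small set of learnable latent queries cross-attends to the frame sequence, yielding a fixed-size embedding regardless of utterance length.

```python
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_pool(feats, queries):
    """Fixed-size summary of variable-length features via cross attention.

    feats:   (T, d) frame-level features, arbitrary length T.
    queries: (k, d) small set of (normally learnable) latent queries.
    Returns (k, d): output size is set by the queries, not by T,
    unlike mean/statistics pooling which simply averages over time.
    """
    d = feats.shape[-1]
    attn = softmax(queries @ feats.T / np.sqrt(d))  # (k, T) weights
    return attn @ feats                             # (k, d) summary

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
emb_short = cross_attention_pool(rng.normal(size=(50, 8)), q)
emb_long = cross_attention_pool(rng.normal(size=(500, 8)), q)
# Both summaries have shape (4, 8) despite very different input lengths.
```

In a trained model the queries are parameters, so the network learns where to attend instead of pooling all frames uniformly.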

Speaker Verification

MossFormer: Pushing the Performance Limit of Monaural Speech Separation using Gated Single-Head Transformer with Convolution-Augmented Joint Self-Attentions

1 code implementation · 23 Feb 2023 · Shengkui Zhao, Bin Ma

To effectively model the indirect elemental interactions across chunks that the dual-path architecture handles poorly, MossFormer employs a joint local and global self-attention architecture that simultaneously performs full-computation self-attention on local chunks and linearised low-cost self-attention over the full sequence.
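The joint attention pattern can be sketched as below. This is an illustrative toy (single head, no projections or gating; names are mine, not MossFormer's): quadratic attention restricted to chunks captures local detail, while a linearised attention over the whole sequence costs O(T·d²) rather than O(T²·d).

```python
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def local_attention(x, chunk):
    """Full (quadratic) self-attention restricted to non-overlapping chunks."""
    out = np.empty_like(x)
    for s in range(0, len(x), chunk):
        c = x[s:s + chunk]
        out[s:s + chunk] = softmax(c @ c.T / np.sqrt(x.shape[1])) @ c
    return out

def linear_attention(x):
    """Linearised attention over the full sequence.

    Uses a positive feature map phi so the (d, d) key-value summary can
    be computed once and shared by every query position: O(T*d^2).
    """
    phi = np.maximum(x, 0) + 1e-6     # simple positive feature map
    kv = phi.T @ x                    # (d, d) summary of the whole sequence
    norm = phi @ phi.sum(axis=0)      # (T,) per-position normaliser
    return (phi @ kv) / norm[:, None]

x = np.random.default_rng(1).normal(size=(12, 4))
y = local_attention(x, chunk=4) + linear_attention(x)  # joint local + global
```

Summing the two branches is only a stand-in for how the real model combines them; the point is that every position gets both exact local context and cheap global context in one pass.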

Speech Separation

Monaural Speech Enhancement with Complex Convolutional Block Attention Module and Joint Time Frequency Losses

1 code implementation · 3 Feb 2021 · Shengkui Zhao, Trung Hieu Nguyen, Bin Ma

In this paper, we propose a complex convolutional block attention module (CCBAM) to boost the representation power of the complex-valued convolutional layers by constructing more informative features.
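A CBAM-style channel-then-spatial gating on complex features can be sketched as follows. This is a heavily simplified, assumption-laden illustration of the idea (plain pooling stands in for the learned convolutions and MLPs, and all names are mine): attention maps are computed from the magnitude and rescale real and imaginary parts together.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def complex_block_attention(z, w_ch):
    """Channel and spatial attention applied to a complex feature map.

    z:    (C, F, T) complex-valued features.
    w_ch: (C, C) stand-in for the channel-attention MLP weights.
    Returns features of the same shape, gated channel-wise and then
    per time-frequency bin, based on the magnitude spectrum.
    """
    mag = np.abs(z)
    # Channel attention: squeeze the spatial dims, gate each channel.
    ch = sigmoid(w_ch @ mag.mean(axis=(1, 2)))   # (C,)
    z = z * ch[:, None, None]
    # Spatial attention: pool over channels, gate each T-F bin
    # (a learned conv would normally replace this plain pooling).
    sp = sigmoid(np.abs(z).mean(axis=0))         # (F, T)
    return z * sp[None]

rng = np.random.default_rng(2)
z = rng.normal(size=(3, 5, 7)) + 1j * rng.normal(size=(3, 5, 7))
out = complex_block_attention(z, rng.normal(size=(3, 3)))
# out keeps the (3, 5, 7) complex shape, with attention-rescaled values.
```

Because the gates are real-valued, they change the magnitude of each complex feature without rotating its phase, which is one simple way to keep the complex structure intact.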

Speech Enhancement

Towards Natural and Controllable Cross-Lingual Voice Conversion Based on Neural TTS Model and Phonetic Posteriorgram

no code implementations · 3 Feb 2021 · Shengkui Zhao, Hao Wang, Trung Hieu Nguyen, Bin Ma

Cross-lingual voice conversion (VC) is an important and challenging problem due to significant mismatches between the phonetic sets and speech prosody of different languages.

Voice Conversion

Towards Natural Bilingual and Code-Switched Speech Synthesis Based on Mix of Monolingual Recordings and Cross-Lingual Voice Conversion

1 code implementation · 16 Oct 2020 · Shengkui Zhao, Trung Hieu Nguyen, Hao Wang, Bin Ma

With these data, three neural TTS models -- Tacotron2, Transformer, and FastSpeech -- are applied to build bilingual and code-switched TTS.

Speech Synthesis · Voice Conversion
