no code implementations • 15 Nov 2022 • Yuying Xie, Thomas Arildsen, Zheng-Hua Tan
For the prior of the speaker identity variable, FHVAE assumes a Gaussian distribution with an utterance-scale varying mean and a fixed variance.
1 code implementation • 5 Apr 2022 • Yuying Xie, Thomas Arildsen, Zheng-Hua Tan
This work proposes a complex recurrent VAE framework that uses a complex-valued recurrent neural network and an L1 reconstruction loss.
no code implementations • 5 Apr 2022 • Yuying Xie, Thomas Arildsen, Zheng-Hua Tan
Autoregressive predictive coding (APC), on the other hand, is a self-supervised objective that has been used to extract meaningful and transferable speech features for multiple downstream tasks.