no code implementations • 5 Jun 2021 • Yu-Lin Huang, Bo-Hao Su, Y.-W. Peter Hong, Chi-Chun Lee
Specifically, we propose a layered-representation variational autoencoder (LR-VAE), which factorizes the speech representation into attribute-sensitive nodes, to derive an identity-free representation for speech emotion recognition (SER) and an emotionless representation for speaker verification (SV).
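The idea of factorizing a latent code into attribute-sensitive nodes can be illustrated with a small sketch. This is not the authors' implementation: the latent size, the assignment of dimensions to speaker or emotion, and the names `SPEAKER_NODES`, `EMOTION_NODES`, `reparameterize`, and `split_latent` are all assumptions made for illustration, and the layered architecture and training objectives of LR-VAE are not described in this excerpt. The sketch only shows standard VAE sampling followed by a partition of the latent vector into two groups.

```python
import math
import random

# Hypothetical layout: which latent dimensions ("nodes") are treated as
# sensitive to which attribute. The real assignment is not given in the excerpt.
LATENT_DIM = 8
SPEAKER_NODES = range(0, 4)
EMOTION_NODES = range(4, 8)


def reparameterize(mu, log_var, rng):
    """Standard VAE sampling: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def split_latent(z):
    """Partition a latent vector into the two attribute-sensitive groups."""
    identity_free = [z[i] for i in EMOTION_NODES]  # would feed an SER classifier
    emotionless = [z[i] for i in SPEAKER_NODES]    # would feed an SV back end
    return identity_free, emotionless


# Placeholder encoder outputs; a trained encoder would produce these from speech.
rng = random.Random(0)
mu = [0.1 * i for i in range(LATENT_DIM)]
log_var = [-2.0] * LATENT_DIM

z = reparameterize(mu, log_var, rng)
ser_features, sv_features = split_latent(z)
```

In this sketch the split is a fixed slice of the latent vector. Whether each group actually carries only emotion or only speaker information depends entirely on how the model is trained, which the sketch does not attempt to reproduce.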