Self-supervised visual pretraining has shown significant progress recently.
REPRESENTATION LEARNING SPEECH EMOTION RECOGNITION SPEECH RECOGNITION
Speech emotion recognition is a challenging task, and extensive reliance has been placed on models that use audio features in building well-performing classifiers.
EMOTION CLASSIFICATION MULTIMODAL EMOTION RECOGNITION MULTIMODAL SENTIMENT ANALYSIS SPEECH EMOTION RECOGNITION
In this work, we adopt a feature-engineering based approach to tackle the task of speech emotion recognition.
Ranked #2 on Speech Emotion Recognition on IEMOCAP
FEATURE ENGINEERING MULTI-CLASS CLASSIFICATION MULTIMODAL EMOTION RECOGNITION SPEECH EMOTION RECOGNITION
In this paper, we present a database of emotional speech intended to be open-sourced and used for synthesis and generation purposes.
SPEECH EMOTION RECOGNITION SPEECH SYNTHESIS TEXT-TO-SPEECH SYNTHESIS
Emotional voice conversion aims to transform emotional prosody in speech while preserving the linguistic content and speaker identity.
Multimodal emotion recognition from speech is an important area in affective computing.
MULTIMODAL DEEP LEARNING MULTIMODAL EMOTION RECOGNITION MULTIMODAL SENTIMENT ANALYSIS SELF-SUPERVISED LEARNING SPEECH EMOTION RECOGNITION TRANSFER LEARNING
Cross-lingual speech emotion recognition is an important task for practical applications.
In this work, we propose an interaction-aware attention network (IAAN) that incorporates contextual information in the learned vocal representation through a novel attention mechanism.
In this work, we explore the impact of visual modality in addition to speech and text for improving the accuracy of the emotion detection system.
EMOTION CLASSIFICATION MULTIMODAL EMOTION RECOGNITION SPEECH EMOTION RECOGNITION
The field of Text-to-Speech has experienced huge improvements in recent years, benefiting from deep learning techniques.
EMOTIONAL SPEECH SYNTHESIS EXPRESSIVE SPEECH SYNTHESIS LATENT VARIABLE MODELS LEARNING NETWORK REPRESENTATIONS SPEECH EMOTION RECOGNITION TEXT-TO-SPEECH SYNTHESIS