Clinical Depression and Affect Recognition with EmoAudioNet

1 Nov 2019 · Emna Rejaibi, Daoud Kadoch, Kamil Bentounes, Romain Alfred, Mohamed Daoudi, Abdenour Hadid, Alice Othmani

Automatic analysis of emotions and affect from speech is an inherently challenging problem with a broad range of applications in Human-Computer Interaction (HCI), health informatics, assistive technologies, and multimedia retrieval. Recognizing a user's basic emotions and reacting accordingly can improve HCI. Moreover, equipping machines to understand the emotions humans express when interacting with one another supports socio-affective intelligence. In this paper, we present a deep neural network architecture called EmoAudioNet, which jointly studies the time-frequency representation of the audio signal and the visual representation of its spectrum of frequencies. Two applications are demonstrated with EmoAudioNet: automatic clinical depression recognition and continuous dimensional emotion recognition from speech. Extensive experiments show that the proposed approach significantly outperforms state-of-the-art approaches on the RECOLA and DAIC-WOZ databases. These competitive results call for applying EmoAudioNet to other affect and emotion recognition tasks from speech.
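The abstract does not specify the exact front-end features, but a time-frequency representation of speech is typically a short-time magnitude spectrogram. The sketch below, using only NumPy, shows one way to compute such a 2-D input that a CNN like EmoAudioNet could ingest; the frame and hop sizes (25 ms / 10 ms at 16 kHz) are common defaults, not values taken from the paper.

```python
import numpy as np

def stft_spectrogram(signal, frame_len=400, hop=160):
    """Magnitude spectrogram via a Hann-windowed short-time FFT.
    Returns an array of shape (n_frames, frame_len // 2 + 1).
    Frame/hop sizes are illustrative (25 ms / 10 ms at 16 kHz)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # rfft keeps only the non-negative frequency bins of the real signal
    return np.abs(np.fft.rfft(frames, axis=1))

# Example: one second of a synthetic 440 Hz tone at 16 kHz
sr = 16000
t = np.arange(sr) / sr
spec = stft_spectrogram(np.sin(2 * np.pi * 440 * t))
print(spec.shape)  # (98, 201): 98 time frames x 201 frequency bins
```

In a dual-input setup such as the one the paper describes, this spectrogram would be fed to one branch of the network while a second representation of the same utterance (e.g. cepstral features) feeds the other, with the branches merged before classification.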
