Speech Emotion Recognition

98 papers with code • 14 benchmarks • 18 datasets

Speech Emotion Recognition is a task of speech processing and computational paralinguistics that aims to recognize and categorize the emotions expressed in spoken language. The goal is to determine a speaker's emotional state, such as happiness, anger, sadness, or frustration, from speech cues like prosody, pitch, and rhythm.
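
A typical pipeline extracts acoustic features from each utterance and feeds them to a classifier. Below is a minimal sketch of that idea, assuming librosa and scikit-learn are available; the file names and labels are hypothetical placeholders, not a real corpus.

```python
# Minimal sketch of a classical SER pipeline: summarize prosodic and spectral
# trajectories per utterance, then train a simple classifier on the statistics.
import numpy as np
import librosa                      # assumed available for feature extraction
from sklearn.svm import SVC

def utterance_features(path, sr=16000):
    """Fixed-length vector of pitch, energy, and MFCC statistics."""
    y, sr = librosa.load(path, sr=sr)
    f0 = librosa.yin(y, fmin=65, fmax=400, sr=sr)            # frame-level pitch (prosody)
    rms = librosa.feature.rms(y=y)[0]                        # frame-level energy
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)       # spectral envelope
    stats = [f0.mean(), f0.std(), rms.mean(), rms.std()]
    stats += mfcc.mean(axis=1).tolist() + mfcc.std(axis=1).tolist()
    return np.array(stats)

# Hypothetical file list and labels; a real setup would use a labeled emotion corpus.
train_files = ["happy_001.wav", "angry_001.wav"]
train_labels = ["happiness", "anger"]

X = np.stack([utterance_features(f) for f in train_files])
clf = SVC(kernel="rbf").fit(X, train_labels)
print(clf.predict([utterance_features("test_001.wav")]))     # hypothetical test file
```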

For multimodal emotion recognition, please upload your results to the Multimodal Emotion Recognition on IEMOCAP benchmark.

Most implemented papers

SERAB: A multi-lingual benchmark for speech emotion recognition

neclow/serab 7 Oct 2021

To facilitate the process, here, we present the Speech Emotion Recognition Adaptation Benchmark (SERAB), a framework for evaluating the performance and generalization capacity of different approaches for utterance-level SER.

Speech Emotion Diarization: Which Emotion Appears When?

speechbrain/speechbrain 22 Jun 2023

Speech Emotion Recognition (SER) typically relies on utterance-level solutions.
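
As a point of contrast with the frame-level diarization the paper proposes, here is a rough sketch of the standard utterance-level approach using SpeechBrain's pretrained wav2vec 2.0 IEMOCAP emotion classifier. The foreign_class interface, model identifier, and returned tuple follow that model's Hugging Face card and may differ across SpeechBrain releases; the input file name is a placeholder.

```python
# Rough sketch of conventional utterance-level SER with a pretrained SpeechBrain
# classifier (interface details taken from the model card and may vary by version;
# older releases import foreign_class from speechbrain.pretrained.interfaces).
from speechbrain.inference.interfaces import foreign_class

classifier = foreign_class(
    source="speechbrain/emotion-recognition-wav2vec2-IEMOCAP",
    pymodule_file="custom_interface.py",
    classname="CustomEncoderWav2vec2Classifier",
)

# One label per utterance, which is the granularity that emotion diarization refines.
out_prob, score, index, text_lab = classifier.classify_file("utterance.wav")  # placeholder path
print(text_lab)
```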

emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

ddlBoJack/emotion2vec 23 Dec 2023

To the best of our knowledge, emotion2vec is the first universal representation model in various emotion-related tasks, filling a gap in the field.

nEMO: Dataset of Emotional Speech in Polish

amu-cai/nemo 9 Apr 2024

Speech emotion recognition has become increasingly important in recent years due to its potential applications in healthcare, customer service, and personalization of dialogue systems.

Transfer Learning for Improving Speech Emotion Classification Accuracy

raulsteleac/Speech_Emotion_Recognition 19 Jan 2018

The majority of existing speech emotion recognition research focuses on automatic emotion detection using training and testing data from the same corpus, collected under the same conditions.

Attention Based Fully Convolutional Network for Speech Emotion Recognition

aris-ai/Audio-and-text-based-emotion-recognition 5 Jun 2018

In this paper, we present a novel attention-based fully convolutional network for speech emotion recognition.
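
The core idea, attention pooling over convolutional feature maps so that emotionally salient frames dominate the utterance representation, can be sketched in a few lines of PyTorch. This is an illustrative toy model under assumed input shapes, not the paper's exact architecture.

```python
# Toy sketch: convolutions over a spectrogram, then attention pooling over time
# so that emotionally salient frames contribute more to the utterance embedding.
import torch
import torch.nn as nn

class AttentiveConvSER(nn.Module):
    def __init__(self, n_mels=64, n_classes=4, channels=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, channels, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.attn = nn.Linear(channels * n_mels, 1)           # one score per frame
        self.head = nn.Linear(channels * n_mels, n_classes)

    def forward(self, spec):                                  # spec: (batch, 1, n_mels, time)
        h = self.conv(spec)                                   # (batch, C, n_mels, time)
        h = h.flatten(1, 2).transpose(1, 2)                   # (batch, time, C * n_mels)
        w = torch.softmax(self.attn(h), dim=1)                # attention weights over time
        pooled = (w * h).sum(dim=1)                           # weighted sum over frames
        return self.head(pooled)                              # class logits

logits = AttentiveConvSER()(torch.randn(2, 1, 64, 120))       # dummy log-mel batch
print(logits.shape)                                           # torch.Size([2, 4])
```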

Evaluating Gammatone Frequency Cepstral Coefficients with Neural Networks for Emotion Recognition from Speech

SoyBison/gammatone 23 Jun 2018

Mel Frequency Cepstral Coefficients (MFCCs) are one of the most commonly used representations for audio speech recognition and classification.
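
For context, MFCCs follow a filterbank, log, DCT recipe; GFCCs keep the same recipe but swap the mel filterbank for a gammatone filterbank. Here is a small sketch of the MFCC side, assuming librosa and SciPy are available (librosa has no built-in gammatone/GFCC front end, so that variant needs a separate filterbank implementation).

```python
# MFCCs spelled out as filterbank -> log -> DCT, checked against librosa's built-in.
import numpy as np
import librosa
from scipy.fftpack import dct

sr = 16000
y = librosa.chirp(fmin=200, fmax=2000, sr=sr, duration=1.0)       # synthetic test signal
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=40)       # mel filterbank energies
log_mel = librosa.power_to_db(mel)                                # log compression
mfcc_manual = dct(log_mel, axis=0, norm="ortho")[:13]             # keep first 13 cepstra
mfcc_builtin = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, n_mels=40)
print(np.allclose(mfcc_manual, mfcc_builtin))                     # expected: True
```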

The Emotional Voices Database: Towards Controlling the Emotion Dimension in Voice Generation Systems

numediart/EmoV-DB 25 Jun 2018

In this paper, we present a database of emotional speech intended to be open-sourced and used for synthesis and generation purposes.

Integrating Recurrence Dynamics for Speech Emotion Recognition

etzinis/nldrp 9 Nov 2018

We investigate the performance of features that can capture nonlinear recurrence dynamics embedded in the speech signal for the task of Speech Emotion Recognition (SER).
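
The kind of recurrence structure these features build on can be illustrated with a time-delay embedding and a binary recurrence matrix. The embedding dimension, delay, and radius below are arbitrary illustrative choices, and the sine wave stands in for a voiced speech frame.

```python
# Sketch: recurrence plot of a 1-D signal via time-delay embedding. RQA features
# (recurrence rate, determinism, ...) are then computed from this binary matrix.
import numpy as np

def recurrence_matrix(x, dim=3, delay=5, radius=0.2):
    """Binary recurrence matrix of signal x with illustrative embedding parameters."""
    n = len(x) - (dim - 1) * delay
    emb = np.stack([x[i * delay : i * delay + n] for i in range(dim)], axis=1)  # (n, dim)
    dists = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)          # pairwise distances
    return (dists <= radius * dists.max()).astype(np.uint8)

frame = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 400))    # stand-in for a voiced frame
R = recurrence_matrix(frame)
print(R.shape, R.mean())                                  # R.mean() is the recurrence rate
```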

Cross Lingual Speech Emotion Recognition: Urdu vs. Western Languages

siddiquelatif/URDU-Dataset 15 Dec 2018

Cross-lingual speech emotion recognition is an important task for practical applications.