Speech Emotion Recognition

98 papers with code • 14 benchmarks • 18 datasets

Speech Emotion Recognition (SER) is a task in speech processing and computational paralinguistics that aims to recognize and categorize the emotions expressed in spoken language. The goal is to determine a speaker's emotional state, such as happiness, anger, sadness, or frustration, from acoustic cues in their speech, such as prosody, pitch, and rhythm.
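As a rough illustration of the acoustic cues mentioned above, the following sketch computes two simple prosodic descriptors, pitch (F0) and energy, from a waveform using only NumPy. It is a minimal example under stated assumptions, not the method of any paper or leaderboard entry listed here: the function names, the 40 ms frame and 10 ms hop, the 60 to 400 Hz pitch search range, and the synthetic 200 Hz test tone are all hypothetical choices made for illustration.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (no padding)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def estimate_pitch(frame, sr, fmin=60.0, fmax=400.0):
    """Estimate F0 of one frame from the peak of its autocorrelation."""
    frame = frame - frame.mean()
    # Keep non-negative lags only.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)  # lag range to search (assumed)
    lag = lo + np.argmax(ac[lo:hi + 1])
    return sr / lag

def prosodic_features(x, sr, frame_ms=40, hop_ms=10):
    """Summarize pitch and energy over all frames of the signal."""
    frame_len = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    frames = frame_signal(x, frame_len, hop)
    f0 = np.array([estimate_pitch(f, sr) for f in frames])
    rms = np.sqrt((frames ** 2).mean(axis=1))
    return {"f0_mean": f0.mean(), "f0_std": f0.std(), "rms_mean": rms.mean()}

# Synthetic stand-in for speech: one second of a 200 Hz tone at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 200.0 * t)
features = prosodic_features(tone, sr)
print(features)
```

Real speech would also need voiced/unvoiced detection, and a practical SER system would pass such features, or learned representations, to a classifier. Both are outside the scope of this sketch.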

For multimodal emotion recognition, please upload your results to Multimodal Emotion Recognition on IEMOCAP.

Benchmarks

These leaderboards track progress in Speech Emotion Recognition.

Dataset	Best Model
IEMOCAP	DANN
CREMA-D	ConformerXL-P
RAVDESS	VQ-MAE-S-12 (Frame) + Query2Emo
MSP-Podcast (Valence)	w2v2-L-robust-12
MSP-Podcast (Activation)	w2v2-L-robust-12
MSP-Podcast (Dominance)	w2v2-L-robust-12
ShEMO	CNN (1D)
EmoDB Dataset	VQ-MAE-S-12 (Frame) + Query2Emo
Dusha Crowd	Dusha baseline
Dusha Podcast	Dusha baseline
LSSED	PyResNet
EMODB	VGG-optiVMD
Quechua-SER	LSTM
MSP-IMPROV	emoDARTS

Libraries

Use these libraries to find Speech Emotion Recognition models and implementations

raulsteleac/Speech_Emotion_Recognit…

3 papers

alibaba-damo-academy/FunASR

2 papers

3,284

aris-ai/Audio-and-text-based-emotio…

2 papers

138

Latest papers

nEMO: Dataset of Emotional Speech in Polish

faceonlive/ai-research • 9 Apr 2024

Speech emotion recognition has become increasingly important in recent years due to its potential applications in healthcare, customer service, and personalization of dialogue systems.

152

Unlocking the Emotional States of High-Risk Suicide Callers through Speech Analysis

alaaNfissi/Unlocking-the-Emotional-States-of-High-Risk-Suicide-Callers-through-Speech-Analysis • IEEE 18th International Conference on Semantic Computing (ICSC) 2024

In light of these challenges, we present a novel end-to-end (E2E) method for speech emotion recognition (SER) as a means of detecting changes in emotional state that may indicate a high risk of suicide.

22 Mar 2024

Iterative Feature Boosting for Explainable Speech Emotion Recognition

alaaNfissi/Iterative-Feature-Boosting-for-Explainable-Speech-Emotion-Recognition • International Conference on Machine Learning and Applications (ICMLA) 2024

In speech emotion recognition (SER), using predefined features without considering their practical importance may lead to high-dimensional datasets, including redundant and irrelevant information.

19 Mar 2024

Speech Emotion Recognition Via CNN-Transformer and Multidimensional Attention Mechanism

scnu-rislab/cnn-transforemr-and-multidimensional-attention-mechanism • 7 Mar 2024

In this paper, to model local and global information at different levels of granularity in speech and capture temporal, spatial and channel dependencies in speech signals, we propose a Speech Emotion Recognition network based on CNN-Transformer and multi-dimensional attention mechanisms.

EMO-SUPERB: An In-depth Look at Speech Emotion Recognition

EMOsuperb/EMO-SUPERB-submission • 20 Feb 2024

Speech emotion recognition (SER) is a pivotal technology for human-computer interaction systems.

Filter-based multi-task cross-corpus feature learning for speech emotion recognition

BakhtiariB/MTCCFLset • Signal, Image and Video Processing 2024

To evaluate the effectiveness of the proposed method, the present research runs experiments on eight well-known public speech emotion corpora and compares the results with eight of the best approaches in the literature.

20 Feb 2024

Frame-level emotional state alignment method for speech emotion recognition

asolitaryman/hflea • 27 Dec 2023

To address this problem, we propose a frame-level emotional state alignment method for SER.

emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

alibaba-damo-academy/FunASR • 23 Dec 2023

To the best of our knowledge, emotion2vec is the first universal representation model in various emotion-related tasks, filling a gap in the field.

3,284

Leveraged Mel spectrograms using Harmonic and Percussive Components in Speech Emotion Recognition

DavidHason/ser • Pacific-Asia Conference on Knowledge Discovery and Data Mining 2022

We attempt to leverage the Mel spectrogram by decomposing distinguishable acoustic features for exploitation in our proposed architecture, which includes a novel feature map generator algorithm, a CNN-based network feature extractor and a multi-layer perceptron (MLP) classifier.

18 Dec 2023

An Extended Variational Mode Decomposition Algorithm Developed Speech Emotion Recognition Performance

DavidHason/VGG-optiVMD • Pacific-Asia Conference on Knowledge Discovery and Data Mining 2023

Emotion recognition (ER) from speech signals is a robust approach since it cannot be imitated like facial expression or text-based sentiment analysis.

18 Dec 2023
