Speech Emotion Recognition
100 papers with code • 14 benchmarks • 18 datasets
Speech Emotion Recognition is a task of speech processing and computational paralinguistics that aims to recognize and categorize the emotions expressed in spoken language. The goal is to infer a speaker's emotional state (happiness, anger, sadness, frustration, and so on) from speech cues such as prosody, pitch, and rhythm.
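The prosodic cues mentioned above (pitch, energy) can be sketched as a minimal frame-level feature extractor. This is an illustrative sketch only, not a reference SER pipeline: the frame length, hop size, and pitch-search range are assumptions, and real systems typically use much richer features (e.g., Mel spectrograms or pretrained speech embeddings) feeding a classifier.

```python
import numpy as np

def frame_features(signal, sr, frame_len=1024, hop=512):
    """Per-frame RMS energy and an autocorrelation-based pitch estimate.
    Illustrative only; real SER front ends use richer representations."""
    feats = []
    for start in range(0, len(signal) - frame_len, hop):
        frame = signal[start:start + frame_len]
        rms = np.sqrt(np.mean(frame ** 2))
        # Pitch: strongest autocorrelation lag within a plausible F0 range
        ac = np.correlate(frame, frame, mode="full")[frame_len - 1:]
        lo, hi = sr // 400, sr // 50          # search roughly 50-400 Hz
        lag = lo + np.argmax(ac[lo:hi])
        feats.append((rms, sr / lag))
    return np.array(feats)

# Toy check: a steady 200 Hz tone should yield pitch estimates near 200 Hz
sr = 16000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 200 * t)
feats = frame_features(tone, sr)
```

In practice these per-frame features would be pooled over an utterance and passed to a classifier that predicts the emotion label.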
For multimodal emotion recognition, please submit your results to the Multimodal Emotion Recognition on IEMOCAP benchmark.
Libraries
Use these libraries to find Speech Emotion Recognition models and implementations.
Latest papers
Leveraged Mel spectrograms using Harmonic and Percussive Components in Speech Emotion Recognition
We attempt to leverage the Mel spectrogram by decomposing distinguishable acoustic features for exploitation in our proposed architecture, which includes a novel feature map generator algorithm, a CNN-based network feature extractor and a multi-layer perceptron (MLP) classifier.
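The harmonic/percussive decomposition referenced above can be illustrated with the standard median-filtering approach (Fitzgerald, 2010): harmonic content is smooth along time, percussive content is smooth along frequency. This is a generic sketch of that technique on a toy spectrogram, not the paper's feature map generator; the kernel size and hard masking are assumptions.

```python
import numpy as np
from scipy.ndimage import median_filter

def hpss_masks(S, kernel=17):
    """Split a magnitude spectrogram into harmonic and percussive parts
    via median filtering. Hard binary masks are used here for simplicity;
    soft (Wiener-style) masks are also common."""
    H = median_filter(S, size=(1, kernel))   # smooth across time frames
    P = median_filter(S, size=(kernel, 1))   # smooth across frequency bins
    mask_h = H >= P
    return S * mask_h, S * ~mask_h

# Toy spectrogram (freq x time): a steady tone is a horizontal line,
# a transient click is a vertical line
S = np.zeros((64, 64))
S[20, :] = 1.0   # harmonic: steady tone at one frequency bin
S[:, 40] = 1.0   # percussive: broadband transient at one frame
harm, perc = hpss_masks(S)
```

The two separated components (or their Mel spectrograms) can then be fed to a downstream network as distinguishable acoustic streams.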
An Extended Variational Mode Decomposition Algorithm Developed Speech Emotion Recognition Performance
Emotion recognition (ER) from speech signals is a robust approach since it cannot be imitated as easily as facial expressions or text-based sentiment analysis.
Pre-trained Speech Processing Models Contain Human-Like Biases that Propagate to Speech Emotion Recognition
We compare biases found in pre-trained models to biases in downstream models adapted to the task of Speech Emotion Recognition (SER) and find that in 66 of the 96 tests performed (69%), the group that is more associated with positive valence as indicated by the SpEAT also tends to be predicted as speaking with higher valence by the downstream model.
LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT
In this paper, we propose LauraGPT, a unified GPT model for audio recognition, understanding, and generation.
Decoding Emotions: A Comprehensive Multilingual Study of Speech Models for Speech Emotion Recognition
Recent advancements in transformer-based speech representation models have greatly transformed speech processing.
Do You Remember? Overcoming Catastrophic Forgetting for Fake Audio Detection
The orthogonal weight modification to overcome catastrophic forgetting does not consider the similarity of genuine audio across different datasets.
Emo-DNA: Emotion Decoupling and Alignment Learning for Cross-Corpus Speech Emotion Recognition
On one hand, our contrastive emotion decoupling achieves decoupling learning via a contrastive decoupling loss to strengthen the separability of emotion-relevant features from corpus-specific ones.
A Change of Heart: Improving Speech Emotion Recognition through Speech-to-Text Modality Conversion
Speech Emotion Recognition (SER) is a challenging task.
Vesper: A Compact and Effective Pretrained Model for Speech Emotion Recognition
Although PTMs shed new light on artificial general intelligence, they are constructed with general tasks in mind, and thus, their efficacy for specific tasks can be further improved.
Cross-Lingual Cross-Age Group Adaptation for Low-Resource Elderly Speech Emotion Recognition
In this work, we analyze the transferability of emotion recognition across three different languages (English, Mandarin Chinese, and Cantonese) and two different age groups (adults and the elderly).