Multimodal Emotion Recognition
57 papers with code • 3 benchmarks • 9 datasets
This is a leaderboard for multimodal emotion recognition on the IEMOCAP dataset. The modality abbreviations are A: Acoustic, T: Text, V: Visual.
Please include the modalities used in brackets after the model name.
All models must use the standard five emotion categories and are evaluated with the standard leave-one-session-out (LOSO) protocol. See the papers for references.
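For reference, LOSO evaluation means training on four of IEMOCAP's five sessions and testing on the held-out one, repeating for every session. Below is a minimal Python sketch of the split and the commonly reported weighted/unweighted accuracy metrics; the dict layout and field names are illustrative assumptions, and papers differ in exactly which five categories they use.

```python
# Minimal sketch of leave-one-session-out (LOSO) evaluation on IEMOCAP.
# Assumption: samples are dicts carrying a "session" field (1-5); the model
# itself is out of scope here.
SESSIONS = (1, 2, 3, 4, 5)  # IEMOCAP is recorded in five sessions

def loso_splits(samples):
    """Yield (held_out_session, train_set, test_set) for each fold."""
    for held_out in SESSIONS:
        train = [s for s in samples if s["session"] != held_out]
        test = [s for s in samples if s["session"] == held_out]
        yield held_out, train, test

def weighted_accuracy(y_true, y_pred):
    """WA: plain accuracy over all test utterances."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def unweighted_accuracy(y_true, y_pred):
    """UA: mean of per-class recalls, robust to class imbalance."""
    recalls = []
    for label in set(y_true):
        idx = [i for i, y in enumerate(y_true) if y == label]
        recalls.append(sum(y_pred[i] == label for i in idx) / len(idx))
    return sum(recalls) / len(recalls)
```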
Libraries
Use these libraries to find Multimodal Emotion Recognition models and implementations.
Latest papers
eMotions: A Large-Scale Dataset for Emotion Recognition in Short Videos
The prevailing use of short videos (SVs) to convey emotions makes emotion recognition in SVs a necessity.
Conversation Understanding using Relational Temporal Graph Neural Networks with Auxiliary Cross-Modality Interaction
Emotion recognition is a crucial task for human conversation understanding.
A Transformer-Based Model With Self-Distillation for Multimodal Emotion Recognition in Conversations
Emotion recognition in conversations (ERC), the task of recognizing the emotion of each utterance in a conversation, is crucial for building empathetic machines.
Hypercomplex Multimodal Emotion Recognition from EEG and Peripheral Physiological Signals
Multimodal emotion recognition from physiological signals is receiving increasing attention because, unlike behavioral reactions, physiological signals cannot be controlled at will and therefore provide more reliable information.
Learning Noise-Robust Joint Representation for Multimodal Emotion Recognition under Realistic Incomplete Data Scenarios
Multimodal emotion recognition (MER) in practical scenarios presents a significant challenge due to the presence of incomplete data, such as missing or noisy data.
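A minimal sketch of one common way to simulate such conditions during training (not necessarily this paper's method): randomly zero out or perturb whole modality feature vectors so the joint representation learns to tolerate missing or noisy inputs. The function name, dict layout, and default probabilities are illustrative assumptions.

```python
import numpy as np

def corrupt_modalities(features, p_missing=0.3, noise_std=0.1, rng=None):
    """features: dict of per-modality vectors, e.g. {"acoustic": a, "text": t, "visual": v}."""
    rng = rng or np.random.default_rng()
    out = {}
    for name, x in features.items():
        if rng.random() < p_missing:
            out[name] = np.zeros_like(x)  # simulate a completely missing modality
        else:
            out[name] = x + rng.normal(0.0, noise_std, size=x.shape)  # simulate feature noise
    return out
```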
CFN-ESA: A Cross-Modal Fusion Network with Emotion-Shift Awareness for Dialogue Emotion Recognition
RUME is applied to extract conversation-level contextual emotional cues while pulling together the data distributions of different modalities; ACME is utilized to perform multimodal interaction centered on the textual modality; and LESM is used to model emotion shift and capture emotion-shift information, thereby guiding the learning of the main task.
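One plausible reading of the emotion-shift signal used as an auxiliary task is to mark an utterance as a shift when its emotion differs from the preceding one; whether LESM uses exactly this definition should be checked against the paper. A short sketch:

```python
def emotion_shift_labels(dialogue_emotions):
    """Derive binary shift labels from an ordered list of per-utterance emotions."""
    labels = [0]  # the first utterance has no predecessor, hence no shift
    for prev, cur in zip(dialogue_emotions, dialogue_emotions[1:]):
        labels.append(int(cur != prev))
    return labels

# emotion_shift_labels(["neutral", "neutral", "angry"]) -> [0, 0, 1]
```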
MER 2023: Multi-label Learning, Modality Robustness, and Semi-Supervised Learning
The first Multimodal Emotion Recognition Challenge (MER 2023) was successfully held at ACM Multimedia.
Decoupled Multimodal Distilling for Emotion Recognition
Specifically, the representation of each modality is decoupled into two parts, i.e., modality-irrelevant and modality-exclusive spaces, in a self-regression manner.
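The general idea of such a decoupling can be illustrated with two projection heads per modality plus a self-reconstruction term, so the shared and exclusive parts jointly preserve the input; this is a hedged sketch of the concept, not the paper's exact architecture or losses.

```python
import torch
import torch.nn as nn

class Decoupler(nn.Module):
    """Split a modality representation into shared and exclusive parts."""
    def __init__(self, dim):
        super().__init__()
        self.to_shared = nn.Linear(dim, dim)      # modality-irrelevant projection
        self.to_exclusive = nn.Linear(dim, dim)   # modality-exclusive projection
        self.reconstruct = nn.Linear(2 * dim, dim)

    def forward(self, h):
        shared = self.to_shared(h)
        exclusive = self.to_exclusive(h)
        recon = self.reconstruct(torch.cat([shared, exclusive], dim=-1))
        recon_loss = nn.functional.mse_loss(recon, h)  # keep the split information-preserving
        return shared, exclusive, recon_loss
```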
Multimodal Information Bottleneck: Learning Minimal Sufficient Unimodal and Multimodal Representations
To this end, we introduce the multimodal information bottleneck (MIB), aiming to learn a powerful and sufficient multimodal representation that is free of redundancy and to filter out noisy information in unimodal representations.
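The bottleneck idea itself can be sketched with a stochastic Gaussian encoder whose KL divergence to a standard normal prior penalizes information retained about the input; the actual MIB objective may differ, so treat this as an illustration only.

```python
import torch
import torch.nn as nn

class StochasticEncoder(nn.Module):
    """Map features to a Gaussian latent and return the KL 'compression' term."""
    def __init__(self, in_dim, z_dim):
        super().__init__()
        self.mu = nn.Linear(in_dim, z_dim)
        self.logvar = nn.Linear(in_dim, z_dim)

    def forward(self, x):
        mu, logvar = self.mu(x), self.logvar(x)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        # KL( N(mu, sigma^2) || N(0, I) ), averaged over the batch
        kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(dim=-1).mean()
        return z, kl

# Training objective: task_loss + beta * kl, where beta trades off compression and accuracy.
```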
Exploiting modality-invariant feature for robust multimodal emotion recognition with missing modalities
Multimodal emotion recognition leverages complementary information across modalities to improve performance.
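One simple way to encourage modality-invariant features (a hedged sketch, not necessarily the paper's loss) is to encode each modality separately and pull the embeddings of the same utterance toward each other, so any single available modality remains usable when the others are missing.

```python
import torch
import torch.nn.functional as F

def invariance_loss(embeddings):
    """embeddings: list of [batch, dim] tensors, one per available modality."""
    loss, pairs = 0.0, 0
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            loss = loss + F.mse_loss(embeddings[i], embeddings[j])  # pull pairs together
            pairs += 1
    return loss / max(pairs, 1)
```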