We introduce a novel framework for evaluating multimodal deep learning models with respect to their language understanding and generalization abilities.
Multimodal emotion recognition from speech is an important area in affective computing.
MULTIMODAL DEEP LEARNING, MULTIMODAL EMOTION RECOGNITION, MULTIMODAL SENTIMENT ANALYSIS, SELF-SUPERVISED LEARNING, SPEECH EMOTION RECOGNITION, TRANSFER LEARNING
Combining complementary information from multiple modalities is intuitively appealing for improving the performance of learning-based approaches.
In particular, we also investigate a special case of multi-modality learning (MML), namely cross-modality learning (CML), which arises widely in remote sensing (RS) image classification applications.
The goal of score following is to track a musical performance, usually in the form of audio, in a corresponding score representation.
Our work improves on existing multimodal deep learning algorithms in two essential ways: (1) it presents a novel method for performing cross-modality (before features are learned from individual modalities) and (2) it extends the previously proposed cross-connections, which only transfer information between streams that process compatible data.
In recent years, natural language descriptions have been used to obtain information about the discriminative parts of an object.
Ranked #1 on Multimodal Deep Learning on CUB-200-2011
DOCUMENT TEXT CLASSIFICATION, FINE-GRAINED IMAGE CLASSIFICATION, MULTIMODAL TEXT AND IMAGE CLASSIFICATION
Memes on the Internet are often harmless and sometimes amusing.
Ranked #1 on Meme Classification on Hateful Memes (using extra training data)
Emotion recognition is a challenging research area given its complex nature, as humans express emotional cues across various modalities such as language, facial expressions, and speech.
MULTIMODAL DEEP LEARNING, MULTIMODAL EMOTION RECOGNITION, MULTIMODAL SENTIMENT ANALYSIS, REPRESENTATION LEARNING, SELF-SUPERVISED LEARNING
Multimedia content in social media platforms provides significant information during disaster events.
Ranked #1 on Disaster Response on CrisisMMD
DISASTER RESPONSE, MULTIMODAL DEEP LEARNING, SMALL DATA IMAGE CLASSIFICATION