Greatest papers with code

ShapeWorld - A new test methodology for multimodal language understanding

14 Apr 2017 • AlexKuhnle/ShapeWorld

We introduce a novel framework for evaluating multimodal deep learning models with respect to their language understanding and generalization abilities.

MULTIMODAL DEEP LEARNING • VISUAL QUESTION ANSWERING

Learn to Combine Modalities in Multimodal Deep Learning

29 May 2018 • skywaLKer518/MultiplicativeMultimodal

Combining complementary information from multiple modalities is intuitively appealing for improving the performance of learning-based approaches.

MULTIMODAL DEEP LEARNING

More Diverse Means Better: Multimodal Deep Learning Meets Remote Sensing Imagery Classification

12 Aug 2020 • danfenghong/IEEE_TGRS_MDL-RS

In particular, we also investigate a special case of multi-modality learning (MML) -- cross-modality learning (CML) that exists widely in RS image classification applications.

IMAGE CLASSIFICATION • MULTIMODAL DEEP LEARNING

Audio-Conditioned U-Net for Position Estimation in Full Sheet Images

16 Oct 2019 • CPJKU/audio_conditioned_unet

The goal of score following is to track a musical performance, usually in the form of audio, in a corresponding score representation.

MULTIMODAL DEEP LEARNING

XFlow: Cross-modal Deep Neural Networks for Audiovisual Classification

2 Sep 2017 • catalina17/XFlow

Our work improves on existing multimodal deep learning algorithms in two essential ways: (1) it presents a novel method for performing cross-modality (before features are learned from individual modalities), and (2) it extends the previously proposed cross-connections, which only transfer information between streams that process compatible data.

LIP READING • MULTIMODAL DEEP LEARNING

Multimodal Emotion Recognition with Transformer-Based Self Supervised Feature Fusion

27 Oct 2020 • shamanez/Self-Supervised-Embedding-Fusion-Transformer

Emotion recognition is a challenging research area given its complex nature: humans express emotional cues across various modalities such as language, facial expressions, and speech.

MULTIMODAL DEEP LEARNING • MULTIMODAL EMOTION RECOGNITION • MULTIMODAL SENTIMENT ANALYSIS • REPRESENTATION LEARNING • SELF-SUPERVISED LEARNING