Learning Language-guided Adaptive Hyper-modality Representation for Multimodal Sentiment Analysis
Although Multimodal Sentiment Analysis (MSA) proves effective by utilizing rich information from multiple sources (e.g., language, video, and audio), potentially sentiment-irrelevant and conflicting information across modalities may prevent performance from improving further. To alleviate this, we present the Adaptive Language-guided Multimodal Transformer (ALMT), which incorporates an Adaptive Hyper-modality Learning (AHL) module to learn an irrelevance/conflict-suppressing representation from visual and audio features under the guidance of language features at different scales. With the obtained hyper-modality representation, the model can derive a complementary joint representation through multimodal fusion for effective MSA. In practice, ALMT achieves state-of-the-art performance on several popular datasets (e.g., MOSI, MOSEI, and CH-SIMS), and extensive ablation studies demonstrate the validity and necessity of our irrelevance/conflict suppression mechanism.
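To make the idea of language-guided hyper-modality learning concrete, below is a minimal PyTorch sketch of one plausible AHL-style layer: language features condition the queries of cross-attention over audio and visual features, so that sentiment-irrelevant or conflicting content receives low attention weight. The module name, dimensions, and exact attention layout are assumptions for illustration, not the authors' reference implementation.

```python
# Illustrative sketch of language-guided hyper-modality learning (assumed layout).
import torch
import torch.nn as nn


class AdaptiveHyperModalityLayer(nn.Module):
    """One hypothetical AHL-style layer: language features guide how much
    audio/visual information flows into the hyper-modality representation."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        # Cross-attention from the (language-conditioned) hyper-modality tokens
        # to the audio and visual feature sequences.
        self.attn_audio = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.attn_visual = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, hyper, language, audio, visual):
        # Queries are conditioned on language so that irrelevant/conflicting
        # audio and visual content is down-weighted by the attention scores.
        query = hyper + language
        from_audio, _ = self.attn_audio(query, audio, audio)
        from_visual, _ = self.attn_visual(query, visual, visual)
        # Residual update of the hyper-modality representation.
        return hyper + from_audio + from_visual


# Toy usage: batch of 2 samples, 20 time steps, feature dim 128.
dim = 128
layer = AdaptiveHyperModalityLayer(dim)
hyper = torch.zeros(2, 20, dim)      # learnable hyper-modality tokens in practice
language = torch.randn(2, 20, dim)   # language features at one scale
audio = torch.randn(2, 20, dim)
visual = torch.randn(2, 20, dim)
out = layer(hyper, language, audio, visual)
print(out.shape)  # torch.Size([2, 20, 128])
```

In the full model, such a layer would be stacked so that language features at different scales repeatedly refine the hyper-modality representation before multimodal fusion; the stacking depth and fusion head shown here are left out for brevity.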
Results from the Paper
Ranked #1 on Multimodal Sentiment Analysis on CMU-MOSEI (Acc-7 metric)
| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Multimodal Sentiment Analysis | CH-SIMS | ALMT | F1 | 81.57 | #2 |
| Multimodal Sentiment Analysis | CH-SIMS | ALMT | MAE | 0.404 | #2 |
| Multimodal Sentiment Analysis | CH-SIMS | ALMT | Corr | 0.619 | #1 |
| Multimodal Sentiment Analysis | CH-SIMS | ALMT | Acc-5 | 45.73 | #1 |
| Multimodal Sentiment Analysis | CH-SIMS | ALMT | Acc-3 | 68.93 | #1 |
| Multimodal Sentiment Analysis | CH-SIMS | ALMT | Acc-2 | 81.19 | #1 |
| Multimodal Sentiment Analysis | CMU-MOSEI | ALMT | MAE | 0.526 | #3 |
| Multimodal Sentiment Analysis | CMU-MOSEI | ALMT | Acc-7 | 54.28 | #1 |
| Multimodal Sentiment Analysis | CMU-MOSEI | ALMT | Acc-5 | 55.96 | #1 |
| Multimodal Sentiment Analysis | CMU-MOSEI | ALMT | Corr | 0.779 | #1 |
| Multimodal Sentiment Analysis | CMU-MOSI | ALMT | MAE | 0.683 | #2 |
| Multimodal Sentiment Analysis | CMU-MOSI | ALMT | Corr | 0.805 | #3 |
| Multimodal Sentiment Analysis | CMU-MOSI | ALMT | Acc-7 | 49.42 | #1 |
| Multimodal Sentiment Analysis | CMU-MOSI | ALMT | Acc-5 | 56.41 | #1 |