Multimodal Machine Translation
35 papers with code • 3 benchmarks • 5 datasets
Multimodal machine translation is the task of performing machine translation with multiple data sources; for example, translating the English sentence "a bird is flying over water", together with an image of a bird over water, into German text.
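At a high level, an MMT system conditions the translation on both the source sentence and image features. A minimal NumPy sketch of one common fusion strategy — concatenating a sentence encoding with an image feature vector and projecting back to the model dimension — with all names and dimensions hypothetical, not taken from any specific system:

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse_text_and_image(text_enc, img_feat, w):
    """Concatenate a sentence encoding with an image feature vector
    and project back to the model dimension (a simple, illustrative
    fusion; real systems vary)."""
    joint = np.concatenate([text_enc, img_feat])
    return np.tanh(w @ joint)

d_text, d_img, d_model = 8, 4, 8
text_enc = rng.standard_normal(d_text)  # encoder output for "a bird is flying over water"
img_feat = rng.standard_normal(d_img)   # e.g. CNN features of the bird image
w = rng.standard_normal((d_model, d_text + d_img))

fused = fuse_text_and_image(text_enc, img_feat, w)
print(fused.shape)  # (8,)
```

The fused vector would then feed the decoder in place of (or alongside) the text-only context.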
(Image credit: Findings of the Third Shared Task on Multimodal Machine Translation)
Latest papers
Dynamic Context-guided Capsule Network for Multimodal Machine Translation
In particular, we represent the input image with global and regional visual features, and we introduce two parallel DCCNs to model multimodal context vectors with visual features at different granularities.
Multimodal Transformer for Multimodal Machine Translation
Multimodal Machine Translation (MMT) aims to introduce information from another modality, generally static images, to improve translation quality.
Self-Knowledge Distillation with Progressive Refinement of Targets
Hence, it can be interpreted within the framework of knowledge distillation, in which the student becomes its own teacher.
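The "student becomes its own teacher" idea can be illustrated by softening the one-hot training target with the model's own earlier prediction, with the mixing weight growing over training. This is a generic self-distillation sketch under assumed notation, not the paper's exact formulation:

```python
import numpy as np

def refined_target(one_hot, past_pred, alpha):
    """Mix the hard label with the model's own past softmax output;
    larger alpha means more self-distillation."""
    return (1.0 - alpha) * one_hot + alpha * past_pred

one_hot = np.array([0.0, 1.0, 0.0])
past_pred = np.array([0.1, 0.7, 0.2])  # model's earlier prediction (hypothetical)
for alpha in [0.0, 0.3, 0.6]:          # weight progressively increased
    t = refined_target(one_hot, past_pred, alpha)
    assert np.isclose(t.sum(), 1.0)    # still a valid distribution
    print(alpha, t)
```

Training then minimizes cross-entropy against the refined target rather than the raw one-hot label.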
M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-training
We present M3P, a Multitask Multilingual Multimodal Pre-trained model that combines multilingual pre-training and multimodal pre-training into a unified framework via multitask pre-training.
Distilling Translations with Visual Awareness
Previous work on multimodal machine translation has shown that visual information is only needed in very specific cases, for example in the presence of ambiguous words where the textual context is not sufficient.
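One way to act on this observation is to gate the visual contribution so it only enters the decoder state when the textual context is insufficient. A generic gating sketch (hypothetical names, not the paper's exact method):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(text_ctx, img_ctx, w_gate):
    """A scalar gate, computed from both contexts, decides how much
    visual information is mixed into the decoder context."""
    g = sigmoid(w_gate @ np.concatenate([text_ctx, img_ctx]))
    return text_ctx + g * img_ctx, g

rng = np.random.default_rng(1)
text_ctx = rng.standard_normal(6)   # textual context vector
img_ctx = rng.standard_normal(6)    # visual context vector
w_gate = rng.standard_normal(12)

fused, g = gated_fusion(text_ctx, img_ctx, w_gate)
print(round(float(g), 3))           # gate value in (0, 1)
```

With an ambiguous word the learned gate would open (g near 1); with sufficient textual context it would stay near 0, leaving the translation text-driven.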
Multimodal Machine Translation with Embedding Prediction
Multimodal machine translation is an attractive application of neural machine translation (NMT).
Latent Variable Model for Multi-modal Translation
In this work, we propose to model the interaction between visual and textual features for multi-modal neural machine translation (MMT) through a latent variable model.
UMONS Submission for WMT18 Multimodal Translation Task
This paper describes the UMONS solution for the Multimodal Machine Translation Task presented at the third conference on machine translation (WMT18).
Findings of the Third Shared Task on Multimodal Machine Translation
In this task, a source sentence in English is supplemented by an image, and participating systems are required to translate the sentence into German, French or Czech.
A Visual Attention Grounding Neural Model for Multimodal Machine Translation
The model leverages a visual attention grounding mechanism that links the visual semantics with the corresponding textual semantics.
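Such grounding can be sketched as dot-product attention in which a textual representation attends over image-region features to produce a visual context vector. This is a minimal illustration of the general mechanism, not the paper's specific architecture:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def visual_attention(text_query, regions):
    """Attend over image-region features with a textual query;
    returns the weighted visual context and the attention weights."""
    scores = regions @ text_query       # one score per region
    weights = softmax(scores)
    return weights @ regions, weights

rng = np.random.default_rng(2)
regions = rng.standard_normal((5, 6))   # 5 image regions, feature dim 6
text_query = rng.standard_normal(6)     # e.g. embedding of the word "bird"

ctx, w = visual_attention(text_query, regions)
print(ctx.shape, round(float(w.sum()), 3))
```

The attention weights link each word to the image regions that share its semantics, which is the grounding the model exploits.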