Multimodal Machine Translation

35 papers with code • 3 benchmarks • 5 datasets

Multimodal machine translation is the task of translating text using more than one source of data. For example, the English sentence "a bird is flying over water" is translated into German with the help of an image of a bird over water.

(Image credit: Findings of the Third Shared Task on Multimodal Machine Translation)
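To make the task definition concrete, the following is a minimal sketch of how one multimodal translation instance might be represented. The class name `MultimodalExample`, the function `build_model_input`, the dictionary keys, and the feature values are all hypothetical and are not taken from any of the listed repositories.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class MultimodalExample:
    """One translation instance: source text plus a paired image representation."""
    source_text: str                 # sentence to translate
    image_features: Sequence[float]  # hypothetical precomputed image vector
    target_lang: str                 # e.g. "de" for German


def build_model_input(example: MultimodalExample) -> dict:
    """Package both modalities so a translation model could condition on each.

    The dictionary keys are illustrative, not a real library's input format.
    """
    return {
        "tokens": example.source_text.lower().split(),  # naive whitespace tokenizer
        "visual": list(example.image_features),
        "target_lang": example.target_lang,
    }


example = MultimodalExample(
    source_text="a bird is flying over water",
    image_features=[0.12, 0.80, 0.05],  # placeholder values, not real features
    target_lang="de",
)
model_input = build_model_input(example)
```

In a real system the image vector would typically come from a pretrained vision encoder, and the tokenizer would be a subword model rather than a whitespace split.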

Dynamic Context-guided Capsule Network for Multimodal Machine Translation

DeepLearnXMU/MM-DCCN 4 Sep 2020

In particular, we represent the input image with global and regional visual features, and we introduce two parallel DCCNs to model multimodal context vectors with visual features at different granularities.

41
04 Sep 2020

Multimodal Transformer for Multimodal Machine Translation

QAQ-v/MMT ACL 2020

Multimodal Machine Translation (MMT) aims to introduce information from other modalities, generally static images, to improve translation quality.

15
01 Jul 2020

Self-Knowledge Distillation with Progressive Refinement of Targets

lgcnsai/ps-kd-pytorch ICCV 2021

Hence, it can be interpreted within a knowledge distillation framework in which the student becomes its own teacher.

80
22 Jun 2020
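The idea of a student acting as its own teacher can be sketched as follows. This is a minimal illustration, assuming the training target is a linear blend of the hard label and the model's own prediction from an earlier epoch, with the blend weight growing linearly over training. The function name `refined_targets`, the parameter `alpha_max`, and the linear schedule are assumptions made for illustration and may differ from the lgcnsai/ps-kd-pytorch implementation.

```python
def refined_targets(one_hot, prev_probs, epoch, total_epochs, alpha_max=0.8):
    """Blend hard labels with the model's own earlier predictions.

    one_hot      -- hard label as a list of 0/1 values
    prev_probs   -- the same model's predicted probabilities from a past epoch
    epoch        -- current epoch index, starting at 0
    total_epochs -- total number of training epochs
    alpha_max    -- assumed upper bound on the weight given to past predictions
    """
    # Trust in past predictions grows linearly as training progresses (assumption).
    alpha = alpha_max * epoch / total_epochs
    return [(1 - alpha) * y + alpha * p for y, p in zip(one_hot, prev_probs)]


hard_label = [0.0, 1.0, 0.0]
past_prediction = [0.2, 0.7, 0.1]

# At epoch 0 the target is the hard label; later it is progressively softened.
early = refined_targets(hard_label, past_prediction, epoch=0, total_epochs=10)
later = refined_targets(hard_label, past_prediction, epoch=5, total_epochs=10)
```

Because both inputs sum to one, the blended target remains a valid probability distribution at every epoch.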

M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-training

microsoft/M3P CVPR 2021

We present M3P, a Multitask Multilingual Multimodal Pre-trained model that combines multilingual pre-training and multimodal pre-training into a unified framework via multitask pre-training.

67
04 Jun 2020

Distilling Translations with Visual Awareness

ImperialNLP/MMT-Delib ACL 2019

Previous work on multimodal machine translation has shown that visual information is only needed in very specific cases, for example in the presence of ambiguous words where the textual context is not sufficient.

10
18 Jun 2019

Multimodal Machine Translation with Embedding Prediction

toshohirasawa/nmtpytorch-emb-pred NAACL 2019

Multimodal machine translation is an attractive application of neural machine translation (NMT).

2
01 Apr 2019

Latent Variable Model for Multi-modal Translation

iacercalixto/variational_mmt ACL 2019

In this work, we propose to model the interaction between visual and textual features for multi-modal neural machine translation (MMT) through a latent variable model.

16
01 Nov 2018

UMONS Submission for WMT18 Multimodal Translation Task

jbdel/WMT18_MNMT 15 Oct 2018

This paper describes the UMONS solution for the Multimodal Machine Translation Task presented at the Third Conference on Machine Translation (WMT18).

5
15 Oct 2018

Findings of the Third Shared Task on Multimodal Machine Translation

multi30k/dataset WS 2018

In this task, a source sentence in English is supplemented by an image, and participating systems are required to translate that sentence into German, French or Czech.

160
01 Oct 2018

A Visual Attention Grounding Neural Model for Multimodal Machine Translation

Eurus-Holmes/VAG-NMT EMNLP 2018

The model leverages a visual attention grounding mechanism that links the visual semantics with the corresponding textual semantics.

10
24 Aug 2018