Video-Guided Machine Translation
2 papers with code • 2 benchmarks • 0 datasets
Most implemented papers
VATEX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research
We also introduce two tasks for video-and-language research based on VATEX: (1) Multilingual Video Captioning, aimed at describing a video in various languages with a compact unified captioning model, and (2) Video-guided Machine Translation, to translate a source language description into the target language using the video information as additional spatiotemporal context.
Video and Text Matching with Conditioned Embeddings
Traditionally, video and text matching is done by learning a shared embedding space, and the encoding of one modality is independent of the other.
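As a rough illustration of the traditional setup described above, the following minimal sketch projects video and text features into one shared space with two separate encoders and ranks candidate texts by cosine similarity. Everything here is an assumption made for illustration: the random linear projections stand in for trained encoders, and the names `make_projection`, `encode`, and `rank_texts` are hypothetical and not taken from either paper.

```python
import math
import random


def make_projection(in_dim, out_dim, rng):
    """Hypothetical stand-in for a trained encoder: a random linear map."""
    return [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def encode(features, projection):
    """Project one modality into the shared space and L2-normalize the result.

    The encoder sees only its own modality, which is the independence
    property the text above refers to.
    """
    vec = [sum(w * x for w, x in zip(row, features)) for row in projection]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # guard against a zero vector
    return [v / norm for v in vec]


def cosine(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


def rank_texts(video_feat, text_feats, video_proj, text_proj):
    """Return candidate text indices sorted by similarity to the video, plus raw scores."""
    v = encode(video_feat, video_proj)
    scores = [cosine(v, encode(t, text_proj)) for t in text_feats]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores


if __name__ == "__main__":
    rng = random.Random(0)
    video_proj = make_projection(4, 3, rng)  # 4-dim video features -> 3-dim shared space
    text_proj = make_projection(5, 3, rng)   # 5-dim text features -> 3-dim shared space
    video = [0.1, 0.2, 0.3, 0.4]
    texts = [[1, 0, 0, 0, 0], [0, 1, 0, 0, 0], [0.5, 0.5, 0, 0, 0]]
    order, scores = rank_texts(video, texts, video_proj, text_proj)
    print(order, scores)
```

Because each text embedding is computed without reference to the video, it can be precomputed once and reused for every query, which is the main practical appeal of this independent-encoder design.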