Search Results for author: Shuichiro Shimizu

Found 5 papers, 3 papers with code

SlideAVSR: A Dataset of Paper Explanation Videos for Audio-Visual Speech Recognition

no code implementations · 18 Jan 2024 · Hao Wang, Shuhei Kurita, Shuichiro Shimizu, Daisuke Kawahara

Audio-visual speech recognition (AVSR) is a multimodal extension of automatic speech recognition (ASR), using video as a complement to audio.

Tasks: Audio-Visual Speech Recognition, Automatic Speech Recognition, +4

Video-Helpful Multimodal Machine Translation

1 code implementation · 31 Oct 2023 · Yihang Li, Shuichiro Shimizu, Chenhui Chu, Sadao Kurohashi, Wei Li

In addition to its extensive training set, EVA contains a video-helpful evaluation set in which subtitles are ambiguous and videos are guaranteed to be helpful for disambiguation.

Tasks: Multimodal Machine Translation, Translation

Towards Speech Dialogue Translation Mediating Speakers of Different Languages

1 code implementation · 16 May 2023 · Shuichiro Shimizu, Chenhui Chu, Sheng Li, Sadao Kurohashi

We present a new task: speech dialogue translation mediating speakers of different languages.

Tasks: Translation

VISA: An Ambiguous Subtitles Dataset for Visual Scene-Aware Machine Translation

1 code implementation · LREC 2022 · Yihang Li, Shuichiro Shimizu, Weiqi Gu, Chenhui Chu, Sadao Kurohashi

Existing multimodal machine translation (MMT) datasets consist of images paired with video captions or general subtitles, which rarely contain linguistic ambiguity, making the visual information of little use for generating appropriate translations.

Tasks: Multimodal Machine Translation, Sentence, +1
