Search Results for author: Atsushi Fujita

Found 39 papers, 5 papers with code

Combining Sequence Distillation and Transfer Learning for Efficient Low-Resource Neural Machine Translation Models

no code implementations WMT (EMNLP) 2020 Raj Dabre, Atsushi Fujita

This paper investigates a combination of sequence distillation (SD) and transfer learning (TL) for training efficient NMT models for extremely low-resource (ELR) settings, where we utilize TL with helping corpora twice: once for distilling the ELR corpora and then during compact model training.

Low-Resource Neural Machine Translation NMT +3

Combination of Neural Machine Translation Systems at WMT20

no code implementations WMT (EMNLP) 2020 Benjamin Marie, Raphael Rubino, Atsushi Fujita

This paper presents neural machine translation systems and their combination built for the WMT20 English-Polish and Japanese→English translation tasks.

Machine Translation NMT +1

Attainable Text-to-Text Machine Translation vs. Translation: Issues Beyond Linguistic Processing

1 code implementation MTSummit 2021 Atsushi Fujita

Existing approaches for machine translation (MT) mostly translate given text in the source language into the target language without explicitly referring to information indispensable for producing a proper translation.

Machine Translation Translation

Investigating Softmax Tempering for Training Neural Machine Translation Models

no code implementations MTSummit 2021 Raj Dabre, Atsushi Fujita

In low-resource scenarios, NMT models tend to perform poorly because the model training quickly converges to a point where the softmax distribution computed using the logits approaches the gold label distribution.

Machine Translation NMT +1

Unsupervised Translation Quality Estimation Exploiting Synthetic Data and Pre-trained Multilingual Encoder

no code implementations 9 Nov 2023 Yuto Kuroda, Atsushi Fujita, Tomoyuki Kajiwara, Takashi Ninomiya

In this paper, we extensively investigate the usefulness of synthetic TQE data and pre-trained multilingual encoders in unsupervised sentence-level TQE, both of which have been proven effective in supervised training scenarios.

Sentence Translation

Bilingual Corpus Mining and Multistage Fine-Tuning for Improving Machine Translation of Lecture Transcripts

1 code implementation 7 Nov 2023 Haiyue Song, Raj Dabre, Chenhui Chu, Atsushi Fujita, Sadao Kurohashi

To create the parallel corpora, we propose a dynamic-programming-based sentence alignment algorithm that leverages the cosine similarity of machine-translated sentences.

Benchmarking Machine Translation +3
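
As an illustration of the alignment step described in the entry above, here is a minimal dynamic-programming sketch that monotonically aligns two lists of sentence vectors (for example, vectors of machine-translated source sentences and of target sentences) by maximizing cumulative cosine similarity. The restriction to 1-to-1 alignments with skips and the `skip_penalty` parameter are simplifications of ours, not the paper's exact algorithm.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

def align(src_vecs, tgt_vecs, skip_penalty=0.0):
    """Monotonic 1-to-1 alignment maximizing cumulative cosine similarity.

    src_vecs, tgt_vecs: lists of fixed-size vectors (e.g., embeddings of
    machine-translated source sentences and of target sentences).
    Returns a list of (i, j) index pairs.
    """
    n, m = len(src_vecs), len(tgt_vecs)
    score = np.full((n + 1, m + 1), -np.inf)
    back = np.zeros((n + 1, m + 1), dtype=int)  # 0: match, 1: skip src, 2: skip tgt
    score[0, :] = 0.0
    score[:, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cands = (
                score[i - 1, j - 1] + cosine(src_vecs[i - 1], tgt_vecs[j - 1]),  # match
                score[i - 1, j] - skip_penalty,                                   # skip a source sentence
                score[i, j - 1] - skip_penalty,                                   # skip a target sentence
            )
            back[i, j] = int(np.argmax(cands))
            score[i, j] = cands[back[i, j]]
    # Backtrace from the bottom-right corner.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if back[i, j] == 0:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif back[i, j] == 1:
            i -= 1
        else:
            j -= 1
    return pairs[::-1]
```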

Scientific Credibility of Machine Translation Research: A Meta-Evaluation of 769 Papers

2 code implementations ACL 2021 Benjamin Marie, Atsushi Fujita, Raphael Rubino

MT evaluations in recent papers tend to copy and compare automatic metric scores from previous work to claim the superiority of a method or an algorithm, without confirming that exactly the same training, validation, and testing data have been used or that the metric scores are comparable.

Machine Translation Translation

Recurrent Stacking of Layers in Neural Networks: An Application to Neural Machine Translation

no code implementations 18 Jun 2021 Raj Dabre, Atsushi Fujita

Finally, we analyze the effects of recurrently stacked layers by visualizing the attention of models that use recurrently stacked layers and of models that do not.

Knowledge Distillation Machine Translation +3

Understanding Pre-Editing for Black-Box Neural Machine Translation

no code implementations EACL 2021 Rei Miyata, Atsushi Fujita

Pre-editing is the process of modifying the source text (ST) so that it can be translated by machine translation (MT) with better quality.

Machine Translation NMT +1

Synthesizing Monolingual Data for Neural Machine Translation

no code implementations 29 Jan 2021 Benjamin Marie, Atsushi Fujita

Nonetheless, large monolingual data in the target domains or languages are not always available to generate large synthetic parallel data.

Machine Translation NMT +1

Softmax Tempering for Training Neural Machine Translation Models

no code implementations 20 Sep 2020 Raj Dabre, Atsushi Fujita

Neural machine translation (NMT) models are typically trained using a softmax cross-entropy loss where the softmax distribution is compared against smoothed gold labels.

Machine Translation NMT +1
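
The loss described above can be sketched in a few lines. The following is a minimal, illustrative implementation assuming that tempering simply means dividing the logits by a temperature greater than one before the softmax during training; the function name and the hyperparameter values are ours, not the paper's code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def tempered_cross_entropy(logits, gold, temperature=2.0, label_smoothing=0.1):
    """Cross-entropy against smoothed gold labels, with tempered logits.

    logits: (batch, vocab) raw scores; gold: (batch,) gold token ids.
    Dividing the logits by a temperature > 1 flattens the predicted
    distribution, which keeps the loss (and hence the gradients) from
    vanishing too early in training.
    """
    batch, vocab = logits.shape
    probs = softmax(logits / temperature)
    # Smoothed one-hot gold distribution.
    smooth = np.full((batch, vocab), label_smoothing / (vocab - 1))
    smooth[np.arange(batch), gold] = 1.0 - label_smoothing
    return -(smooth * np.log(probs + 1e-9)).sum(axis=-1).mean()

# Toy usage: a higher temperature yields flatter predictions and a larger loss.
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10))
gold = rng.integers(0, 10, size=4)
print(tempered_cross_entropy(logits, gold, temperature=1.0))
print(tempered_cross_entropy(logits, gold, temperature=2.0))
```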

Tagged Back-translation Revisited: Why Does It Really Work?

no code implementations ACL 2020 Benjamin Marie, Raphael Rubino, Atsushi Fujita

In this paper, we show that neural machine translation (NMT) systems trained on large back-translated data overfit some of the characteristics of machine-translated texts.

Machine Translation NMT +2

Balancing Cost and Benefit with Tied-Multi Transformers

no code implementations WS 2020 Raj Dabre, Raphael Rubino, Atsushi Fujita

We propose and evaluate a novel procedure for training multiple Transformers with tied parameters, which compresses multiple models into one and enables the dynamic choice of the number of encoder and decoder layers during decoding.

Knowledge Distillation Machine Translation +2

Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation

1 code implementation LREC 2020 Haiyue Song, Raj Dabre, Atsushi Fujita, Sadao Kurohashi

To address this, we examine a language-independent framework for parallel corpus mining, which is a quick and effective way to mine a parallel corpus from publicly available lectures on Coursera.

Benchmarking Domain Adaptation +4

Supervised and Unsupervised Machine Translation for Myanmar-English and Khmer-English

no code implementations WS 2019 Benjamin Marie, Hour Kaing, Aye Myat Mon, Chenchen Ding, Atsushi Fujita, Masao Utiyama, Eiichiro Sumita

This paper presents NICT's supervised and unsupervised machine translation systems for the WAT2019 Myanmar-English and Khmer-English translation tasks.

NMT Translation +1

Exploiting Multilingualism through Multistage Fine-Tuning for Low-Resource Neural Machine Translation

no code implementations IJCNLP 2019 Raj Dabre, Atsushi Fujita, Chenhui Chu

This paper highlights the impressive utility of multi-parallel corpora for transfer learning in a one-to-many low-resource neural machine translation (NMT) setting.

Low-Resource Neural Machine Translation NMT +2

Multi-Layer Softmaxing during Training Neural Machine Translation for Flexible Decoding with Fewer Layers

no code implementations 27 Aug 2019 Raj Dabre, Atsushi Fujita

This paper proposes a novel procedure for training an encoder-decoder based deep neural network that compresses N×M models into a single model, enabling us to dynamically choose the number of encoder and decoder layers for decoding.

Machine Translation Translation
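
To give a concrete picture of the training procedure, here is a minimal PyTorch-style sketch assuming the core idea is to compute the cross-entropy loss from every decoder layer's hidden states through a shared output projection and average the losses, so that upper layers can be dropped at decoding time. The function and argument names are placeholders, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def multi_layer_softmax_loss(decoder_layer_outputs, proj, gold, pad_id=0):
    """Average the cross-entropy losses computed from every decoder layer.

    decoder_layer_outputs: list of tensors (batch, tgt_len, d_model),
        the hidden states after each decoder layer.
    proj: shared output projection (d_model -> vocab).
    gold: (batch, tgt_len) gold token ids.
    Training this way lets us decode with fewer layers, because every
    layer has been trained to feed the softmax directly.
    """
    total = 0.0
    for h in decoder_layer_outputs:
        logits = proj(h)  # (batch, tgt_len, vocab)
        total = total + F.cross_entropy(
            logits.view(-1, logits.size(-1)),
            gold.view(-1),
            ignore_index=pad_id,
        )
    return total / len(decoder_layer_outputs)
```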

NICT's Supervised Neural Machine Translation Systems for the WMT19 News Translation Task

no code implementations WS 2019 Raj Dabre, Kehai Chen, Benjamin Marie, Rui Wang, Atsushi Fujita, Masao Utiyama, Eiichiro Sumita

In this paper, we describe the supervised neural machine translation (NMT) systems that we developed for the news translation task for the Kazakh↔English, Gujarati↔English, Chinese↔English, and English→Finnish translation directions.

Machine Translation NMT +2

Exploiting Out-of-Domain Parallel Data through Multilingual Transfer Learning for Low-Resource Neural Machine Translation

1 code implementation WS 2019 Aizhan Imankulova, Raj Dabre, Atsushi Fujita, Kenji Imamura

This paper proposes a novel multilingual multistage fine-tuning approach for low-resource neural machine translation (NMT), taking a challenging Japanese-Russian pair for benchmarking.

Benchmarking Domain Adaptation +4

Unsupervised Joint Training of Bilingual Word Embeddings

no code implementations ACL 2019 Benjamin Marie, Atsushi Fujita

State-of-the-art methods for unsupervised bilingual word embeddings (BWE) train a mapping function that maps pre-trained monolingual word embeddings into a bilingual space.

Translation Unsupervised Machine Translation +1
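
For context, the mapping approach mentioned in the entry above can be summarized in a few lines: given a seed dictionary of word pairs, an orthogonal mapping is obtained as the Procrustes solution. This is a generic sketch of that mapping baseline, not the joint-training method proposed in the paper.

```python
import numpy as np

def procrustes_mapping(X, Y):
    """Learn an orthogonal W minimizing ||XW - Y||_F (Procrustes solution).

    X: (n, d) source-language embeddings of seed dictionary entries.
    Y: (n, d) embeddings of their target-language translations.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Mapped source embeddings then live in the target space: X_mapped = X @ W
```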

Unsupervised Extraction of Partial Translations for Neural Machine Translation

no code implementations NAACL 2019 Benjamin Marie, Atsushi Fujita

We propose a new algorithm for extracting from monolingual data what we call partial translations: pairs of source and target sentences that contain sequences of tokens that are translations of each other.

Machine Translation NMT +1

Recurrent Stacking of Layers for Compact Neural Machine Translation Models

no code implementations 14 Jul 2018 Raj Dabre, Atsushi Fujita

In neural machine translation (NMT), the most common practice is to stack a number of recurrent or feed-forward layers in the encoder and the decoder.

Machine Translation NMT +1
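
A minimal sketch of the recurrent-stacking idea described above: instead of stacking N distinct layers, the parameters of a single layer are reused N times. The use of `nn.TransformerEncoderLayer` is only a stand-in for whichever recurrent or feed-forward layer is stacked; this is an illustration, not the paper's model code.

```python
import torch
import torch.nn as nn

class RecurrentlyStackedEncoder(nn.Module):
    """Applies the same Transformer encoder layer `num_steps` times.

    The parameter count stays that of a single layer, while the computation
    depth matches a conventional `num_steps`-layer encoder.
    """
    def __init__(self, d_model=512, nhead=8, num_steps=6):
        super().__init__()
        self.layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                                batch_first=True)
        self.num_steps = num_steps

    def forward(self, x, src_key_padding_mask=None):
        for _ in range(self.num_steps):  # same weights reused at every step
            x = self.layer(x, src_key_padding_mask=src_key_padding_mask)
        return x

# Toy usage with input of shape (batch, seq_len, d_model).
enc = RecurrentlyStackedEncoder()
out = enc(torch.randn(2, 7, 512))
```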

Japanese to English/Chinese/Korean Datasets for Translation Quality Estimation and Automatic Post-Editing

no code implementations WS 2017 Atsushi Fujita, Eiichiro Sumita

Aiming at facilitating the research on quality estimation (QE) and automatic post-editing (APE) of machine translation (MT) outputs, especially for those among Asian languages, we have created new datasets for Japanese to English, Chinese, and Korean translations.

Automatic Post-Editing Benchmarking +2

Efficient Extraction of Pseudo-Parallel Sentences from Raw Monolingual Data Using Word Embeddings

no code implementations ACL 2017 Benjamin Marie, Atsushi Fujita

We propose a new method for extracting pseudo-parallel sentences from a pair of large monolingual corpora, without relying on any document-level information.

Domain Adaptation Information Retrieval +4
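
As an illustration of the general approach (not the paper's exact scoring), the sketch below averages bilingual word embeddings into sentence vectors and pairs each source sentence with its most similar target sentence by cosine similarity, keeping only pairs above a threshold; the threshold value and helper names are ours.

```python
import numpy as np

def sentence_vector(tokens, emb, dim=300):
    """Average the (cross-lingually aligned) word embeddings of a sentence."""
    vecs = [emb[t] for t in tokens if t in emb]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def extract_pseudo_parallel(src_sents, tgt_sents, src_emb, tgt_emb, threshold=0.6):
    """Pair each source sentence with its most similar target sentence.

    src_emb / tgt_emb map tokens to vectors in a shared bilingual space;
    pairs whose cosine similarity falls below `threshold` are discarded.
    """
    S = np.array([sentence_vector(s, src_emb) for s in src_sents])
    T = np.array([sentence_vector(t, tgt_emb) for t in tgt_sents])
    S /= np.linalg.norm(S, axis=1, keepdims=True) + 1e-9
    T /= np.linalg.norm(T, axis=1, keepdims=True) + 1e-9
    sims = S @ T.T                      # all pairwise cosine similarities
    best = sims.argmax(axis=1)          # best target for each source sentence
    return [(i, int(j), float(sims[i, j]))
            for i, j in enumerate(best) if sims[i, j] >= threshold]
```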

Phrase Table Induction Using In-Domain Monolingual Data for Domain Adaptation in Statistical Machine Translation

no code implementations TACL 2017 Benjamin Marie, Atsushi Fujita

We present a new framework to induce an in-domain phrase table from in-domain monolingual data that can be used to adapt a general-domain statistical machine translation system to the targeted domain.

Domain Adaptation Machine Translation +1
