Search Results for author: James Cross

Found 26 papers, 11 papers with code

Non-autoregressive Translation with Disentangled Context Transformer

1 code implementation ICML 2020 Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu

State-of-the-art neural machine translation models generate a translation from left to right and every step is conditioned on the previously generated tokens.

Machine Translation · Sentence · +1
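
The left-to-right generation described in this abstract is the standard autoregressive factorization; fully non-autoregressive decoding drops the dependence on previously generated tokens. Written out as a sketch of the standard formulation (not notation taken from the paper):

    p_{\mathrm{AR}}(y \mid x) = \prod_{t=1}^{T} p(y_t \mid y_{<t}, x)
    \qquad \text{vs.} \qquad
    p_{\mathrm{NAR}}(y \mid x) = \prod_{t=1}^{T} p(y_t \mid x)

The disentangled context (DisCo) transformer sits between these extremes, predicting each target token conditioned on a subset of the other target tokens rather than only on the left context.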

Facebook AI’s WMT21 News Translation Task Submission

1 code implementation WMT (EMNLP) 2021 Chau Tran, Shruti Bhosale, James Cross, Philipp Koehn, Sergey Edunov, Angela Fan

We describe Facebook’s multilingual model submission to the WMT2021 shared task on news translation.

Translation

Efficiently Upgrading Multilingual Machine Translation Models to Support More Languages

no code implementations 7 Feb 2023 Simeng Sun, Maha Elbayad, Anna Sun, James Cross

With multilingual machine translation (MMT) models continuing to grow in size and number of supported languages, it is natural to reuse and upgrade existing models to save computation as data becomes available in more languages.

Machine Translation · Translation

Multilingual Neural Machine Translation with Deep Encoder and Multiple Shallow Decoders

no code implementations EACL 2021 Xiang Kong, Adithya Renduchintala, James Cross, Yuqing Tang, Jiatao Gu, Xian Li

Recent work in multilingual translation has advanced translation quality beyond bilingual baselines by using deep transformer models with increased capacity.

Machine Translation · Translation

Multilingual Machine Translation with Hyper-Adapters

3 code implementations 22 May 2022 Christos Baziotis, Mikel Artetxe, James Cross, Shruti Bhosale

We find that hyper-adapters are more parameter efficient than regular adapters, reaching the same performance with up to 12 times fewer parameters.

Machine Translation · Translation
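
The parameter-efficiency claim above comes from replacing one adapter per language with a single hyper-network that generates adapter weights from a language embedding. A minimal PyTorch sketch of that idea, with assumed dimensions and hypothetical names (not the paper's code):

    import torch
    import torch.nn as nn

    class HyperAdapter(nn.Module):
        """One small hyper-network emits adapter weights for any language,
        instead of storing a separate bottleneck adapter per language."""
        def __init__(self, d_model=512, d_bottleneck=64, d_lang=32, num_langs=100):
            super().__init__()
            self.lang_emb = nn.Embedding(num_langs, d_lang)
            # language embedding -> flattened (down, up, biases) adapter parameters
            n_params = 2 * d_model * d_bottleneck + d_bottleneck + d_model
            self.hyper = nn.Linear(d_lang, n_params)
            self.d_model, self.d_bottleneck = d_model, d_bottleneck

        def forward(self, hidden, lang_id):
            # hidden: (batch, seq, d_model); lang_id: scalar LongTensor
            w = self.hyper(self.lang_emb(lang_id))
            d, b = self.d_model, self.d_bottleneck
            w_down = w[:d * b].view(d, b)
            w_up = w[d * b:2 * d * b].view(b, d)
            b_down = w[2 * d * b:2 * d * b + b]
            b_up = w[2 * d * b + b:]
            return hidden + torch.relu(hidden @ w_down + b_down) @ w_up + b_up

With, say, 100 languages, separate bottleneck adapters store roughly 100 copies of the down/up projections, while the hyper-network above is shared across languages; this sharing is the kind of saving behind the "up to 12 times fewer parameters" figure.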

Lifting the Curse of Multilinguality by Pre-training Modular Transformers

no code implementations NAACL 2022 Jonas Pfeiffer, Naman Goyal, Xi Victoria Lin, Xian Li, James Cross, Sebastian Riedel, Mikel Artetxe

Multilingual pre-trained models are known to suffer from the curse of multilinguality, which causes per-language performance to drop as they cover more languages.

Named Entity Recognition · +3

Data Selection Curriculum for Neural Machine Translation

no code implementations 25 Mar 2022 Tasnim Mohiuddin, Philipp Koehn, Vishrav Chaudhary, James Cross, Shruti Bhosale, Shafiq Joty

In this work, we introduce a two-stage curriculum training framework for NMT where we fine-tune a base NMT model on subsets of data, selected by both deterministic scoring using pre-trained methods and online scoring that considers prediction scores of the emerging NMT model.

Machine Translation · NMT · +1
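
A hedged sketch of the two-stage curriculum described above, using placeholder interfaces rather than the authors' implementation: an offline pass selects a subset with a pre-trained scorer, then fine-tuning repeatedly re-selects data using the emerging model's own prediction scores.

    def curriculum_finetune(data, pretrained_score, nmt_model, keep_frac=0.5, epochs=5):
        """Stage 1: deterministic offline selection; stage 2: online re-selection.
        `pretrained_score`, `sentence_logprob` and `finetune` are hypothetical hooks."""
        # Stage 1: score each sentence pair once with a pre-trained method
        # (e.g. cross-entropy under a pre-trained model) and keep the top fraction.
        ranked = sorted(data, key=pretrained_score, reverse=True)
        subset = ranked[:int(len(ranked) * keep_frac)]

        # Stage 2: between epochs, re-rank by the current NMT model's own
        # prediction score so the selection tracks the emerging model.
        for _ in range(epochs):
            subset.sort(key=nmt_model.sentence_logprob, reverse=True)
            nmt_model.finetune(subset[:int(len(subset) * keep_frac)])
        return nmt_model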

Alternative Input Signals Ease Transfer in Multilingual Machine Translation

no code implementations ACL 2022 Simeng Sun, Angela Fan, James Cross, Vishrav Chaudhary, Chau Tran, Philipp Koehn, Francisco Guzman

Further, we find that incorporating alternative inputs via self-ensemble can be particularly effective when the training set is small, leading to +5 BLEU when only 5% of the total training data is accessible.

Machine Translation · Translation
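
The self-ensemble mentioned above amounts to averaging the same model's output distributions over alternative renderings of a source sentence (for example, the original script and a transliteration). A minimal sketch with a hypothetical model interface:

    import torch

    def self_ensemble_next_token(model, alt_sources, target_prefix):
        """Average one model's next-token distributions over alternative
        input signals of the same sentence (`model(...)` returns logits;
        the interface is hypothetical)."""
        probs = [torch.softmax(model(src, target_prefix), dim=-1)
                 for src in alt_sources]   # e.g. [original, transliteration]
        return torch.stack(probs).mean(dim=0)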

Tricks for Training Sparse Translation Models

no code implementations NAACL 2022 Dheeru Dua, Shruti Bhosale, Vedanuj Goswami, James Cross, Mike Lewis, Angela Fan

Multi-task learning with an unbalanced data distribution skews model learning towards high resource tasks, especially when model capacity is fixed and fully shared across all tasks.

Machine Translation · Multi-Task Learning · +1

Classification-based Quality Estimation: Small and Efficient Models for Real-world Applications

no code implementations EMNLP 2021 Shuo Sun, Ahmed El-Kishky, Vishrav Chaudhary, James Cross, Francisco Guzmán, Lucia Specia

Sentence-level quality estimation (QE) of machine translation is traditionally formulated as a regression task, and the performance of QE models is typically measured by Pearson correlation with human labels.

Machine Translation · Model Compression · +3
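
For context on the framing above: regression-style QE predicts a continuous quality score and is evaluated by Pearson correlation with human judgments, whereas the classification framing predicts a discrete label. A small illustrative example (the scores and the 0.5 threshold are made up):

    from scipy.stats import pearsonr

    human     = [0.92, 0.40, 0.75, 0.10, 0.88]   # human quality judgments
    predicted = [0.85, 0.55, 0.70, 0.20, 0.90]   # regression-QE outputs

    r, _ = pearsonr(predicted, human)            # the usual regression-QE metric
    print(f"Pearson r = {r:.3f}")

    # A classification-based formulation instead predicts a discrete label,
    # e.g. "acceptable" vs "not acceptable" via an (arbitrary) 0.5 threshold.
    labels = [int(h >= 0.5) for h in human]      # -> [1, 0, 1, 0, 1]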

Facebook AI WMT21 News Translation Task Submission

no code implementations 6 Aug 2021 Chau Tran, Shruti Bhosale, James Cross, Philipp Koehn, Sergey Edunov, Angela Fan

We describe Facebook's multilingual model submission to the WMT2021 shared task on news translation.

Translation

On the Evaluation of Machine Translation for Terminology Consistency

1 code implementation 22 Jun 2021 Md Mahfuz ibn Alam, Antonios Anastasopoulos, Laurent Besacier, James Cross, Matthias Gallé, Philipp Koehn, Vassilina Nikoulina

As neural machine translation (NMT) systems become an important part of professional translator pipelines, a growing body of work focuses on combining NMT with terminologies.

Domain Adaptation · Machine Translation · +2

Improving Zero-Shot Translation by Disentangling Positional Information

1 code implementation ACL 2021 Danni Liu, Jan Niehues, James Cross, Francisco Guzmán, Xian Li

The difficulty of generalizing to new translation directions suggests that the model representations are highly specific to the language pairs seen in training.

Machine Translation · Translation

Learn to Talk via Proactive Knowledge Transfer

no code implementations 23 Aug 2020 Qing Sun, James Cross

In this paper, we provide an in-depth analysis of KL-divergence minimization in Forward and Backward orders, which shows that learners are reinforced via on-policy learning in Backward.

Knowledge Distillation · Machine Translation · +2
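
The Forward/Backward distinction analyzed in this paper is about which distribution the expectation is taken under when minimizing the KL divergence between a teacher p and a learner q_theta; in standard form (not notation from the paper):

    \mathrm{KL}(p \,\|\, q_\theta) = \mathbb{E}_{y \sim p}\left[\log p(y) - \log q_\theta(y)\right]
    \qquad
    \mathrm{KL}(q_\theta \,\|\, p) = \mathbb{E}_{y \sim q_\theta}\left[\log q_\theta(y) - \log p(y)\right]

Minimizing the backward direction requires sampling from the learner q_theta itself, which is the sense in which the abstract says learners are reinforced via on-policy learning in Backward.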

Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation

2 code implementations ICLR 2021 Jungo Kasai, Nikolaos Pappas, Hao Peng, James Cross, Noah A. Smith

We show that the speed disadvantage for autoregressive baselines compared to non-autoregressive methods has been overestimated in three aspects: suboptimal layer allocation, insufficient speed measurement, and lack of knowledge distillation.

Knowledge Distillation · Machine Translation · +1
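
The layer-allocation point can be made concrete with a back-of-the-envelope count: at a fixed total depth, moving layers from the decoder to the encoder removes work from the sequential per-token decoding loop. The configurations and numbers below are illustrative assumptions, not the paper's released settings:

    # Same total depth, different allocation.
    baseline     = {"encoder_layers": 6,  "decoder_layers": 6}
    deep_shallow = {"encoder_layers": 12, "decoder_layers": 1}

    def decoder_layer_applications(cfg, target_len=25):
        # Rough count: each generated token passes through every decoder layer.
        return cfg["decoder_layers"] * target_len

    print(decoder_layer_applications(baseline))      # 150
    print(decoder_layer_applications(deep_shallow))  # 25

The "insufficient speed measurement" point is the companion caveat: speed should be reported under more than one condition, for example single-sentence latency as well as large-batch throughput.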

Non-Autoregressive Machine Translation with Disentangled Context Transformer

1 code implementation 15 Jan 2020 Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu

State-of-the-art neural machine translation models generate a translation from left to right and every step is conditioned on the previously generated tokens.

Machine Translation · Sentence · +1

Monotonic Multihead Attention

3 code implementations ICLR 2020 Xutai Ma, Juan Pino, James Cross, Liezl Puzon, Jiatao Gu

Simultaneous machine translation models start generating a target sequence before they have encoded or read the source sequence.

Machine Translation · Translation
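
To make the simultaneous setting concrete, the sketch below shows a simple fixed wait-k read/write loop: read k source tokens, then alternate writing one target token per additional source token. This is a simpler policy than the monotonic multihead attention learned in the paper and is shown only to illustrate decoding before the full source is available; the model interface is hypothetical.

    def wait_k_decode(model, source_stream, k=3, max_len=100):
        """Interleave READ (consume a source token) and WRITE (emit a target token)."""
        read, written = [], []
        for src_tok in source_stream:
            read.append(src_tok)                                  # READ
            if len(read) < k:
                continue
            written.append(model.predict_next(read, written))     # WRITE
            if written[-1] == "</s>" or len(written) >= max_len:
                return written
        # Source exhausted: finish the rest of the target autoregressively.
        while (not written or written[-1] != "</s>") and len(written) < max_len:
            written.append(model.predict_next(read, written))
        return written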

Proactive Sequence Generator via Knowledge Acquisition

no code implementations 25 Sep 2019 Qing Sun, James Cross, Dmitriy Genzel

Sequence-to-sequence models such as transformers, which are now being used in a wide variety of NLP tasks, typically need to have very high capacity in order to perform well.

Knowledge Distillation · Sentence

Simple Fusion: Return of the Language Model

1 code implementation WS 2018 Felix Stahlberg, James Cross, Veselin Stoyanov

Neural Machine Translation (NMT) typically leverages monolingual data in training through backtranslation.

Language Modelling · Machine Translation · +3
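
The "simple fusion" of the title is an alternative way to use monolingual data: a language model trained on target-side monolingual text is combined with the translation model at the output layer. A hedged sketch of a product-of-distributions combination in that spirit (the paper's exact PRENORM/POSTNORM definitions may place the normalization differently):

    import torch.nn.functional as F

    def simple_fusion(tm_logits, lm_log_probs):
        """Add translation-model and language-model log-probabilities over the
        vocabulary and renormalize to get the fused next-token distribution."""
        combined = F.log_softmax(tm_logits, dim=-1) + lm_log_probs
        return F.log_softmax(combined, dim=-1)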

Incremental Parsing with Minimal Features Using Bi-Directional LSTM

no code implementations ACL 2016 James Cross, Liang Huang

Recently, neural network approaches to parsing have largely automated the combination of individual features, but they still rely on atomic features (often a large number of them) created from human linguistic intuition, potentially omitting important global context.

Binarization · Constituency Parsing · +2

Good, Better, Best: Choosing Word Embedding Context

no code implementations 19 Nov 2015 James Cross, Bing Xiang, Bo-Wen Zhou

We propose two methods of learning vector representations of words and phrases that each combine sentence context with structural features extracted from dependency trees.

Sentence
