Search Results for author: Hany Hassan Awadalla

Found 22 papers, 8 papers with code

Dissecting In-Context Learning of Translations in GPTs

no code implementations • 24 Oct 2023 • Vikas Raunak, Hany Hassan Awadalla, Arul Menezes

Most of the recent work in leveraging Large Language Models (LLMs) such as GPT-3 for Machine Translation (MT) has focused on selecting the few-shot samples for prompting.

In-Context Learning • Machine Translation • +1
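As a rough illustration of the few-shot prompting setup this line of work studies, the Python sketch below builds a demonstration-style translation prompt for a GPT-style model. The template, language pair, and helper name are illustrative assumptions, not the prompts used in the paper.

```python
# Hypothetical helper that formats (source, target) demonstration pairs
# followed by the test sentence; a GPT-style model completes the final line.
def build_fewshot_mt_prompt(examples, source_sentence,
                            src_lang="German", tgt_lang="English"):
    lines = [f"{src_lang}: {src}\n{tgt_lang}: {tgt}" for src, tgt in examples]
    lines.append(f"{src_lang}: {source_sentence}\n{tgt_lang}:")
    return "\n\n".join(lines)

demos = [("Guten Morgen.", "Good morning."),
         ("Wie geht es dir?", "How are you?")]
print(build_fewshot_mt_prompt(demos, "Das Wetter ist heute schön."))
```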

Mixture of Quantized Experts (MoQE): Complementary Effect of Low-bit Quantization and Robustness

no code implementations • 3 Oct 2023 • Young Jin Kim, Raffy Fahim, Hany Hassan Awadalla

In our comprehensive analysis, we show that MoE models with 2-bit expert weights can deliver better model performance than the dense model trained on the same dataset.

Machine Translation • Quantization
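To make the low-bit idea concrete, here is a minimal sketch of round-to-nearest, per-output-channel weight quantization applied to a single expert weight matrix. It only illustrates the general mechanism of 2-bit weight quantization; the paper's actual quantization recipe and MoE integration are not reproduced here.

```python
import torch

def quantize_expert_weights(w: torch.Tensor, bits: int = 2):
    # Round-to-nearest symmetric quantization with one scale per output channel.
    qmax = 2 ** (bits - 1) - 1                                  # 1 for 2-bit weights
    scale = (w.abs().amax(dim=1, keepdim=True) / qmax).clamp_min(1e-8)
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax).to(torch.int8)
    return q, scale                                             # integer codes + per-channel scales

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

# Example: quantize one (hypothetical) expert FFN weight matrix and check the error.
w = torch.randn(1024, 4096)
q, s = quantize_expert_weights(w, bits=2)
print("mean abs error:", (w - dequantize(q, s)).abs().mean().item())
```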

A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models

1 code implementation • 20 Sep 2023 • Haoran Xu, Young Jin Kim, Amr Sharaf, Hany Hassan Awadalla

In this study, we propose a novel fine-tuning approach for LLMs that is specifically designed for the translation task, eliminating the need for the abundant parallel data that traditional translation models usually depend on.

Language Modelling • Machine Translation • +1

Task-Based MoE for Multitask Multilingual Machine Translation

no code implementations • 30 Aug 2023 • Hai Pham, Young Jin Kim, Subhabrata Mukherjee, David P. Woodruff, Barnabas Poczos, Hany Hassan Awadalla

The Mixture-of-Experts (MoE) architecture has proven to be a powerful method for training deep models across diverse tasks and applications.

Machine Translation • Translation

FineQuant: Unlocking Efficiency with Fine-Grained Weight-Only Quantization for LLMs

no code implementations • 16 Aug 2023 • Young Jin Kim, Rawn Henry, Raffy Fahim, Hany Hassan Awadalla

Large Language Models (LLMs) have achieved state-of-the-art performance across various language tasks but pose challenges for practical deployment due to their substantial memory requirements.

Quantization
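The fine-grained, weight-only quantization named in the title can be illustrated with a simple group-wise scheme, where each small group of weights along the input dimension gets its own scale, limiting the impact of outliers. The group size and bit width below are illustrative assumptions, not the paper's recommended settings.

```python
import torch

def groupwise_quantize(w: torch.Tensor, bits: int = 4, group_size: int = 128):
    # Split each row into groups of `group_size` weights; each group gets its own scale.
    out_features, in_features = w.shape
    assert in_features % group_size == 0
    qmax = 2 ** (bits - 1) - 1
    groups = w.reshape(out_features, in_features // group_size, group_size)
    scales = (groups.abs().amax(dim=-1, keepdim=True) / qmax).clamp_min(1e-8)
    q = torch.clamp(torch.round(groups / scales), -qmax - 1, qmax).to(torch.int8)
    return q, scales

def groupwise_dequantize(q, scales, shape):
    return (q.float() * scales).reshape(shape)

w = torch.randn(256, 1024)
q, s = groupwise_quantize(w)
print("max abs error:", (w - groupwise_dequantize(q, s, w.shape)).abs().max().item())
```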

Do GPTs Produce Less Literal Translations?

1 code implementation • 26 May 2023 • Vikas Raunak, Arul Menezes, Matt Post, Hany Hassan Awadalla

On the task of Machine Translation (MT), multiple works have investigated few-shot prompting mechanisms to elicit better translations from LLMs.

Machine Translation • NMT • +3

ResiDual: Transformer with Dual Residual Connections

1 code implementation • 28 Apr 2023 • Shufang Xie, Huishuai Zhang, Junliang Guo, Xu Tan, Jiang Bian, Hany Hassan Awadalla, Arul Menezes, Tao Qin, Rui Yan

In this paper, we propose ResiDual, a novel Transformer architecture with Pre-Post-LN (PPLN), which fuses the connections of Post-LN and Pre-LN, inheriting their advantages while avoiding their limitations.

Machine Translation
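A minimal PyTorch sketch of the dual-residual idea as we read it from the abstract: one stream follows Post-LN (normalize after every residual addition) and one follows Pre-LN (accumulate raw sublayer outputs), with the two streams merged at the top of the stack. The merge step, module sizes, and block layout are approximations, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class DualResidualBlock(nn.Module):
    # One Transformer block keeping two residual streams:
    #   x_post: Post-LN style (normalized after each residual addition)
    #   x_dual: Pre-LN style (raw sublayer outputs accumulated without normalization)
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))
        self.ln_attn = nn.LayerNorm(d_model)
        self.ln_ffn = nn.LayerNorm(d_model)

    def forward(self, x_post, x_dual):
        y, _ = self.attn(x_post, x_post, x_post)   # self-attention sublayer
        x_post = self.ln_attn(x_post + y)
        x_dual = x_dual + y
        y = self.ffn(x_post)                       # feed-forward sublayer
        x_post = self.ln_ffn(x_post + y)
        x_dual = x_dual + y
        return x_post, x_dual

# Usage: run the blocks, then merge the streams with a final LayerNorm (approximation).
d_model = 512
blocks = nn.ModuleList([DualResidualBlock(d_model) for _ in range(6)])
final_ln = nn.LayerNorm(d_model)
x = torch.randn(2, 10, d_model)
x_post, x_dual = x, x
for blk in blocks:
    x_post, x_dual = blk(x_post, x_dual)
out = x_post + final_ln(x_dual)
```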

How Good Are GPT Models at Machine Translation? A Comprehensive Evaluation

1 code implementation • 18 Feb 2023 • Amr Hendy, Mohamed Abdelrehim, Amr Sharaf, Vikas Raunak, Mohamed Gabr, Hitokazu Matsushita, Young Jin Kim, Mohamed Afify, Hany Hassan Awadalla

In this paper, we present a comprehensive evaluation of GPT models for machine translation, covering aspects such as the quality of different GPT models compared with state-of-the-art research and commercial systems, the effect of prompting strategies, robustness to domain shifts, and document-level translation.

Machine Translation • Text Generation • +1

Who Says Elephants Can't Run: Bringing Large Scale MoE Models into Cloud Scale Production

no code implementations • 18 Nov 2022 • Young Jin Kim, Rawn Henry, Raffy Fahim, Hany Hassan Awadalla

Mixture of Experts (MoE) models with conditional execution of sparsely activated layers have enabled training models with a much larger number of parameters.

Machine Translation

Language Tokens: A Frustratingly Simple Approach Improves Zero-Shot Performance of Multilingual Translation

no code implementations • 11 Aug 2022 • Muhammad ElNokrashy, Amr Hendy, Mohamed Maher, Mohamed Afify, Hany Hassan Awadalla

In a WMT evaluation campaign, From-English performance improves by 4.17 BLEU points in the zero-shot setting and by 2.87 BLEU points when direct data is available for training.

Translation
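A minimal sketch of the general language-token idea: mark each training example with tokens identifying the source and target languages, so a single multilingual model can be steered to new directions at inference time. The token format and placement below are assumptions for illustration; the paper's exact scheme may differ.

```python
# Hypothetical tagging helper; the token format and placement are assumptions.
def tag_example(source_tokens, src_lang, tgt_lang):
    return [f"<from:{src_lang}>", f"<to:{tgt_lang}>"] + list(source_tokens)

# Zero-shot direction at inference: only the target-language token changes.
print(tag_example("Guten Morgen .".split(), src_lang="de", tgt_lang="fr"))
```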

Building Multilingual Machine Translation Systems That Serve Arbitrary X-Y Translations

no code implementations • 30 Jun 2022 • Akiko Eriguchi, Shufang Xie, Tao Qin, Hany Hassan Awadalla

Multilingual Neural Machine Translation (MNMT) enables one system to translate sentences from multiple source languages to multiple target languages, greatly reducing deployment costs compared with conventional bilingual systems.

Machine Translation • Translation

Gating Dropout: Communication-efficient Regularization for Sparsely Activated Transformers

no code implementations • 28 May 2022 • Rui Liu, Young Jin Kim, Alexandre Muzio, Hany Hassan Awadalla

Sparsely activated transformers, such as Mixture of Experts (MoE), have received great interest due to their outrageous scaling capability, which enables dramatic increases in model size without significant increases in computational cost.

Machine Translation

Scalable and Efficient MoE Training for Multitask Multilingual Models

1 code implementation • 22 Sep 2021 • Young Jin Kim, Ammar Ahmad Awan, Alexandre Muzio, Andres Felipe Cruz Salinas, Liyang Lu, Amr Hendy, Samyam Rajbhandari, Yuxiong He, Hany Hassan Awadalla

By combining the efficient system and training methods, we are able to significantly scale up large multitask multilingual models for language generation, which results in a substantial improvement in model accuracy.

Machine Translation • Text Generation

DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders

2 code implementations • 25 Jun 2021 • Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, Alexandre Muzio, Saksham Singhal, Hany Hassan Awadalla, Xia Song, Furu Wei

While pretrained encoders have achieved success in various natural language understanding (NLU) tasks, there is a gap between these pretrained encoders and natural language generation (NLG).

Abstractive Text Summarization • Machine Translation • +5

Score Combination for Improved Parallel Corpus Filtering for Low Resource Conditions

no code implementations • WMT (EMNLP) 2020 • Muhammad N. ElNokrashy, Amr Hendy, Mohamed Abdelghaffar, Mohamed Afify, Ahmed Tawfik, Hany Hassan Awadalla

For the mBART finetuning setup provided by the organizers, our method shows 7% and 5% relative improvements over the baseline in sacreBLEU on the test sets for Pashto and Khmer, respectively.

Sentence

FastFormers: Highly Efficient Transformer Models for Natural Language Understanding

2 code implementations • EMNLP (sustainlp) 2020 • Young Jin Kim, Hany Hassan Awadalla

In this paper, we present FastFormers, a set of recipes to achieve efficient inference-time performance for Transformer-based models on various NLU tasks.

Knowledge Distillation • Natural Language Understanding

Multi-task Learning for Multilingual Neural Machine Translation

no code implementations • EMNLP 2020 • Yiren Wang, ChengXiang Zhai, Hany Hassan Awadalla

In this work, we propose a multi-task learning (MTL) framework that jointly trains the model with the translation task on bitext data and two denoising tasks on the monolingual data.

Cross-Lingual Transfer • Denoising • +4
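To illustrate how monolingual data can feed a denoising task in such a framework, the sketch below corrupts a sentence by masking tokens and pairs it with the original as the reconstruction target. The mask rate and corruption scheme are illustrative, not the specific denoising tasks or settings used in the paper.

```python
import random

def make_denoising_example(tokens, mask_token="<mask>", mask_prob=0.15, seed=None):
    # Corrupt the input by randomly masking tokens; the original sequence is the target.
    rng = random.Random(seed)
    corrupted = [mask_token if rng.random() < mask_prob else t for t in tokens]
    return corrupted, list(tokens)

noisy, target = make_denoising_example(
    "the quick brown fox jumps over the lazy dog".split(), seed=0)
print(noisy)
print(target)
```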

Detecting Interrogative Utterances with Recurrent Neural Networks

no code implementations • 3 Nov 2015 • Junyoung Chung, Jacob Devlin, Hany Hassan Awadalla

In this paper, we explore different neural network architectures that can predict if a speaker of a given utterance is asking a question or making a statement.

General Classification
