no code implementations • Findings (EMNLP) 2021 • Tianda Li, Ahmad Rashid, Aref Jafari, Pranav Sharma, Ali Ghodsi, Mehdi Rezagholizadeh
Knowledge Distillation (KD) is a model compression algorithm that transfers the knowledge of a large neural network into a smaller one.
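For reference, below is a minimal sketch of the standard KD objective in the style of Hinton et al. (2015), written in PyTorch; the temperature T and mixing weight alpha are illustrative assumptions, and this is not the specific method proposed in the paper.

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Standard KD objective: weighted mix of cross-entropy on the true
    labels and KL divergence between temperature-softened teacher and
    student distributions."""
    ce = F.cross_entropy(student_logits, labels)
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale by T^2 so gradient magnitudes match the CE term
    return alpha * ce + (1.0 - alpha) * kl
```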
no code implementations • Findings (EMNLP) 2021 • Peng Lu, Abbas Ghaddar, Ahmad Rashid, Mehdi Rezagholizadeh, Ali Ghodsi, Philippe Langlais
Knowledge Distillation (KD) is extensively used in Natural Language Processing to compress the pre-training and task-specific fine-tuning phases of large neural language models.
1 code implementation • 7 Nov 2023 • Ahmad Rashid, Serena Hacker, Guojun Zhang, Agustinus Kristiadi, Pascal Poupart
For instance, ReLU networks, a popular class of neural network architectures, have been shown to almost always yield high-confidence predictions when the test data are far away from the training set, even when they are trained with OOD data.
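This overconfidence is easy to reproduce on toy data. The sketch below (illustrative PyTorch, not the paper's experiment) trains a small ReLU classifier on two Gaussian blobs and then probes its softmax confidence on points far from the training data.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy 2-class problem: two Gaussian blobs near the origin.
x = torch.cat([torch.randn(200, 2) - 2, torch.randn(200, 2) + 2])
y = torch.cat([torch.zeros(200, dtype=torch.long),
               torch.ones(200, dtype=torch.long)])

net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    F.cross_entropy(net(x), y).backward()
    opt.step()

# Probe points far from the training data: confidence tends toward 1.0,
# since ReLU nets are piecewise linear and extrapolate without bound.
far = torch.tensor([[100.0, 100.0], [-500.0, 300.0]])
print(F.softmax(net(far), dim=-1).max(dim=-1).values)
```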
no code implementations • 11 Jul 2023 • Runcheng Liu, Ahmad Rashid, Ivan Kobyzev, Mehdi Rezagholizadeh, Pascal Poupart
Prompt-tuning has become an increasingly popular parameter-efficient method for adapting large pretrained language models to downstream tasks.
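For orientation, here is a minimal prompt-tuning sketch in the style of Lester et al. (2021): the backbone stays frozen and only a handful of soft prompt embeddings prepended to the input are trained. The SoftPrompt module and the inputs-embeds interface are illustrative assumptions, not this paper's method.

```python
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Prepends trainable prompt embeddings to frozen input embeddings."""
    def __init__(self, n_tokens: int, embed_dim: int):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(n_tokens, embed_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        batch = input_embeds.size(0)
        p = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([p, input_embeds], dim=1)

# Usage: freeze the backbone and optimize only the prompt parameters.
# `backbone` is a stand-in for any pretrained model accepting input embeddings.
# for param in backbone.parameters():
#     param.requires_grad = False
# optimizer = torch.optim.Adam(soft_prompt.parameters(), lr=1e-3)
```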
no code implementations • 8 May 2023 • Peng Lu, Ahmad Rashid, Ivan Kobyzev, Mehdi Rezagholizadeh, Philippe Langlais
Label Smoothing (LS) is a simple, versatile, and efficient regularization technique that can be applied to various supervised classification tasks.
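Below is a minimal sketch of the standard LS loss (PyTorch assumed; eps is the usual smoothing coefficient), independent of this paper's specific contribution: the one-hot target is replaced by a mix of (1 - eps) on the true class and eps spread uniformly over all classes.

```python
import torch.nn.functional as F

def label_smoothing_loss(logits, labels, eps=0.1):
    """Cross-entropy against a smoothed target distribution."""
    log_probs = F.log_softmax(logits, dim=-1)
    nll = -log_probs.gather(-1, labels.unsqueeze(-1)).squeeze(-1)
    uniform = -log_probs.mean(dim=-1)  # equals (1/n) * sum of -log p
    return ((1.0 - eps) * nll + eps * uniform).mean()
```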
no code implementations • 12 Dec 2022 • Peng Lu, Ivan Kobyzev, Mehdi Rezagholizadeh, Ahmad Rashid, Ali Ghodsi, Philippe Langlais
Moreover, we observe that this simple optimization technique outperforms state-of-the-art KD methods for compact models.
1 code implementation • 30 Jun 2022 • Kira Selby, Ahmad Rashid, Ivan Kobyzev, Mehdi Rezagholizadeh, Pascal Poupart
We propose a general deep architecture for learning functions on multiple permutation-invariant sets.
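For intuition, the sketch below shows a DeepSets-style baseline for functions of several permutation-invariant sets, with sum-pooling per set; it is an illustrative assumption, not the architecture proposed in the paper.

```python
import torch
import torch.nn as nn

class MultiSetModel(nn.Module):
    """Encode each element, pool each set with a permutation-invariant
    sum, then combine the per-set summaries with a final MLP."""
    def __init__(self, in_dim: int, hidden: int, out_dim: int, n_sets: int):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))
        self.rho = nn.Sequential(nn.Linear(n_sets * hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, out_dim))

    def forward(self, sets):  # sets: list of (batch, set_size_i, in_dim)
        pooled = [self.phi(s).sum(dim=1) for s in sets]  # order-invariant
        return self.rho(torch.cat(pooled, dim=-1))
```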
no code implementations • 21 May 2022 • Abbas Ghaddar, Yimeng Wu, Sunyam Bagga, Ahmad Rashid, Khalil Bibi, Mehdi Rezagholizadeh, Chao Xing, Yasheng Wang, Duan Xinyu, Zhefeng Wang, Baoxing Huai, Xin Jiang, Qun Liu, Philippe Langlais
There is a growing body of work in recent years to develop pre-trained language models (PLMs) for the Arabic language.
1 code implementation • 8 Dec 2021 • Abbas Ghaddar, Yimeng Wu, Ahmad Rashid, Khalil Bibi, Mehdi Rezagholizadeh, Chao Xing, Yasheng Wang, Duan Xinyu, Zhefeng Wang, Baoxing Huai, Xin Jiang, Qun Liu, Philippe Langlais
Language-specific pre-trained models have proven to be more accurate than multilingual ones in a monolingual evaluation setting; Arabic is no exception.
no code implementations • 9 Nov 2021 • David Alfonso-Hermelo, Ahmad Rashid, Abbas Ghaddar, Philippe Langlais, Mehdi Rezagholizadeh
We apply NATURE to common slot-filling and intent detection benchmarks and demonstrate that simple NATURE perturbations of the standard evaluation sets can significantly degrade model performance.
no code implementations • 16 Oct 2021 • Tianda Li, Yassir El Mesbahi, Ivan Kobyzev, Ahmad Rashid, Atif Mahmud, Nithin Anchuri, Habib Hajimolahoseini, Yang Liu, Mehdi Rezagholizadeh
Pre-trained Language Models (PLMs) have been successful for a wide range of natural language processing (NLP) tasks.
no code implementations • ACL 2022 • Ali Edalati, Marzieh Tahaei, Ahmad Rashid, Vahid Partovi Nia, James J. Clark, Mehdi Rezagholizadeh
GPT is an auto-regressive Transformer-based pre-trained language model that has attracted considerable attention in the natural language processing (NLP) domain due to its state-of-the-art performance on several downstream tasks.
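To make "auto-regressive" concrete, the sketch below shows greedy left-to-right decoding: each new token is predicted from the prefix generated so far. Here `model` is a stand-in for any GPT-style LM that maps token ids to next-token logits; it is an assumption, not this paper's code.

```python
import torch

@torch.no_grad()
def greedy_generate(model, prompt_ids: torch.Tensor, max_new_tokens: int = 20):
    ids = prompt_ids                       # shape (1, prefix_len)
    for _ in range(max_new_tokens):
        logits = model(ids)                # (1, len, vocab)
        next_id = logits[:, -1].argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=1)  # append and re-feed
    return ids
```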
no code implementations • 29 Sep 2021 • Peng Lu, Ahmad Rashid, Ivan Kobyzev, Mehdi Rezagholizadeh, Philippe Langlais
Knowledge Distillation (KD) is an algorithm that transfers the knowledge of a trained, typically larger, neural network into another model under training.
no code implementations • WNUT (ACL) 2021 • Shivendra Bhardwaj, Abbas Ghaddar, Ahmad Rashid, Khalil Bibi, Chengyang Li, Ali Ghodsi, Philippe Langlais, Mehdi Rezagholizadeh
Knowledge Distillation (KD) is extensively used to compress and deploy large pre-trained language models on edge devices for real-world applications.
1 code implementation • 13 Sep 2021 • Tianda Li, Ahmad Rashid, Aref Jafari, Pranav Sharma, Ali Ghodsi, Mehdi Rezagholizadeh
Knowledge Distillation (KD) is a model compression algorithm that transfers the knowledge of a large neural network into a smaller one.
no code implementations • Findings (ACL) 2021 • Abbas Ghaddar, Philippe Langlais, Mehdi Rezagholizadeh, Ahmad Rashid
Existing Natural Language Understanding (NLU) models have been shown to incorporate dataset biases leading to strong performance on in-distribution (ID) test sets but poor performance on out-of-distribution (OOD) ones.
no code implementations • 24 Jul 2021 • Abbas Ghaddar, Philippe Langlais, Ahmad Rashid, Mehdi Rezagholizadeh
In this work, we examine the ability of NER models to use contextual information when predicting the type of an ambiguous entity.
1 code implementation • ACL 2021 • Ahmad Rashid, Vasileios Lioutas, Mehdi Rezagholizadeh
We present MATE-KD, a novel text-based adversarial training algorithm that improves the performance of knowledge distillation.
no code implementations • 17 Apr 2021 • Kira A. Selby, Yinong Wang, Ruizhe Wang, Peyman Passban, Ahmad Rashid, Mehdi Rezagholizadeh, Pascal Poupart
Despite recent monumental advances in the field, many Natural Language Processing (NLP) models still struggle to perform adequately on noisy domains.
no code implementations • EMNLP 2021 • Ahmad Rashid, Vasileios Lioutas, Abbas Ghaddar, Mehdi Rezagholizadeh
Knowledge Distillation (KD) is a common knowledge transfer algorithm used for model compression across a variety of deep learning based natural language processing (NLP) solutions.
no code implementations • 10 Nov 2020 • Ahmad Rashid, Alan Do-Omri, Md. Akmal Haidar, Qun Liu, Mehdi Rezagholizadeh
B-GAN generates a distributed latent space representation that can be paired with an attention-based decoder to generate fluent sentences.
no code implementations • Findings (EMNLP) 2020 • Vasileios Lioutas, Ahmad Rashid, Krtin Kumar, Md. Akmal Haidar, Mehdi Rezagholizadeh
Word embeddings are vital components of Natural Language Processing (NLP) models and have been extensively explored.
no code implementations • 25 Sep 2019 • Vasileios Lioutas, Ahmad Rashid, Krtin Kumar, Md Akmal Haidar, Mehdi Rezagholizadeh
Word embeddings are a vital component of Natural Language Processing (NLP) systems and have been extensively researched.
no code implementations • NAACL 2019 • Md. Akmal Haidar, Mehdi Rezagholizadeh, Alan Do-Omri, Ahmad Rashid
This soft representation is then used in GAN discrimination to synthesize similar soft-texts.
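The general idea of discriminating on soft text can be sketched as below (illustrative PyTorch, not the paper's exact model): the discriminator consumes per-position probability distributions over the vocabulary rather than discrete tokens, so gradients can flow from the discriminator back into the generator.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, seq_len = 1000, 20

# Discriminator over "soft text": (batch, seq_len, vocab) distributions.
disc = nn.Sequential(nn.Flatten(), nn.Linear(seq_len * vocab, 256),
                     nn.ReLU(), nn.Linear(256, 1))

real_ids = torch.randint(0, vocab, (8, seq_len))
real_soft = F.one_hot(real_ids, vocab).float()       # hard text as soft one-hots
fake_soft = F.softmax(torch.randn(8, seq_len, vocab), dim=-1)  # generator output

d_loss = F.binary_cross_entropy_with_logits(disc(real_soft), torch.ones(8, 1)) \
       + F.binary_cross_entropy_with_logits(disc(fake_soft), torch.zeros(8, 1))
```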
no code implementations • WS 2019 • Ahmad Rashid, Alan Do-Omri, Md. Akmal Haidar, Qun Liu, Mehdi Rezagholizadeh
Latent-space-based GAN methods and attention-based sequence-to-sequence models have achieved impressive results in text generation and unsupervised machine translation, respectively.