Search Results for author: Haidar Khan

Found 19 papers, 3 papers with code

Limitations of Knowledge Distillation for Zero-shot Transfer Learning

no code implementations EMNLP (sustainlp) 2021 Saleh Soltan, Haidar Khan, Wael Hamza

We demonstrate that, in contrast to previous observations for monolingual distillation, in multilingual settings distillation during pretraining is more effective than distillation during fine-tuning for zero-shot transfer learning.
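The distillation objective itself is not spelled out in this snippet; as a point of reference, here is a minimal sketch of the standard soft-label distillation loss (temperature-scaled KL divergence, the common Hinton-style formulation; not necessarily the exact variant used in the paper):

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) between temperature-softened distributions,
    scaled by T^2 so gradient magnitudes stay comparable across temperatures."""
    p = softmax(teacher_logits, T)  # soft targets from the teacher
    q = softmax(student_logits, T)  # student predictions
    return float(T * T * np.sum(p * (np.log(p) - np.log(q))))

# Identical logits give zero loss; disagreeing logits give a positive loss.
assert distillation_loss([1.0, 2.0], [1.0, 2.0]) < 1e-9
assert distillation_loss([2.0, 1.0], [1.0, 2.0]) > 0.0
```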

Knowledge Distillation, Transfer Learning +1

Controlled Data Generation via Insertion Operations for NLU

no code implementations NAACL (ACL) 2022 Manoj Kumar, Yuval Merhav, Haidar Khan, Rahul Gupta, Anna Rumshisky, Wael Hamza

Use of synthetic data is rapidly emerging as a realistic alternative to manually annotating live traffic for industry-scale model building.
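As a toy illustration of insertion-based data generation (the function name and the random-insertion policy below are placeholders; the paper uses a learned insertion model, not uniform random choices):

```python
import random

def insert_tokens(utterance, fillers, n=1, rng=None):
    """Create a synthetic variant of an utterance by inserting filler
    tokens at random positions (toy stand-in for a learned insertion model)."""
    rng = rng or random.Random(0)
    toks = utterance.split()
    for _ in range(n):
        toks.insert(rng.randrange(len(toks) + 1), rng.choice(fillers))
    return " ".join(toks)

out = insert_tokens("play some jazz", ["please", "now"], n=1)
assert len(out.split()) == 4  # exactly one token was inserted
```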

Intent Classification +4

Low-Resource Compositional Semantic Parsing with Concept Pretraining

no code implementations 24 Jan 2023 Subendhu Rongali, Mukund Sridhar, Haidar Khan, Konstantine Arkoudas, Wael Hamza, Andrew McCallum

In this work, we present an architecture to perform such domain adaptation automatically, with only a small amount of metadata about the new domain and without any new training data (zero-shot) or with very few examples (few-shot).

Domain Adaptation, Semantic Parsing

AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model

1 code implementation 2 Aug 2022 Saleh Soltan, Shankar Ananthakrishnan, Jack FitzGerald, Rahul Gupta, Wael Hamza, Haidar Khan, Charith Peris, Stephen Rawls, Andy Rosenbaum, Anna Rumshisky, Chandana Satya Prakash, Mukund Sridhar, Fabian Triefenbach, Apurv Verma, Gokhan Tur, Prem Natarajan

In this work, we demonstrate that multilingual large-scale sequence-to-sequence (seq2seq) models, pre-trained on a mixture of denoising and Causal Language Modeling (CLM) tasks, are more efficient few-shot learners than decoder-only models on various tasks.
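A toy sketch of how the two pretraining objectives mentioned above produce input/target pairs for a seq2seq model; the single-sentinel masking scheme here is an illustrative assumption, not the paper's exact recipe:

```python
import random

def denoising_example(tokens, span_len=2, sentinel="<mask>"):
    """Denoising: mask one contiguous span; the target reconstructs it."""
    start = random.randrange(len(tokens) - span_len + 1)
    corrupted = tokens[:start] + [sentinel] + tokens[start + span_len:]
    target = [sentinel] + tokens[start:start + span_len]
    return corrupted, target

def clm_example(tokens, prefix_len):
    """Causal LM: condition on a prefix, predict the continuation."""
    return tokens[:prefix_len], tokens[prefix_len:]

toks = "the cat sat on the mat".split()
random.seed(0)
corrupted, target = denoising_example(toks)
prefix, continuation = clm_example(toks, 3)
assert corrupted.count("<mask>") == 1 and len(target) == 3
assert prefix + continuation == toks
```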

Causal Language Modeling, Common Sense Reasoning +8

Output Randomization: A Novel Defense for both White-box and Black-box Adversarial Models

no code implementations 8 Jul 2021 Daniel Park, Haidar Khan, Azer Khan, Alex Gittens, Bülent Yener

Adversarial examples pose a threat to deep neural network models in a variety of scenarios, ranging from a "white box" setting, where the adversary has complete knowledge of the model, to the opposite "black box" setting.
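Output randomization, as the title suggests, perturbs what the model returns to the adversary. A minimal sketch of the idea (noisy, re-normalized softmax outputs; the exact noise model used in the paper is not given in this snippet and is an assumption here):

```python
import numpy as np

def randomized_output(logits, sigma=0.05, rng=None):
    """Return class probabilities with Gaussian noise added to the softmax
    output, then re-normalized. The noise corrupts the gradients or finite
    differences an attacker would estimate, while small sigma usually leaves
    the argmax (and hence clean accuracy) intact."""
    rng = rng or np.random.default_rng(0)
    z = np.asarray(logits, dtype=float)
    p = np.exp(z - z.max())
    p /= p.sum()
    p = np.clip(p + rng.normal(0.0, sigma, size=p.shape), 1e-9, None)
    return p / p.sum()

probs = randomized_output([2.0, 0.5, 0.1])
assert abs(probs.sum() - 1.0) < 1e-9  # still a valid distribution
assert probs.argmax() == 0            # top class unchanged at small sigma
```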

Using multiple ASR hypotheses to boost i18n NLU performance

no code implementations ICON 2020 Charith Peris, Gokmen Oz, Khadige Abboud, Venkata Sai Varada, Prashan Wanigasekara, Haidar Khan

For IC and NER multi-task experiments, when evaluating on the mismatched test set, we see improvements across all domains in German and in 17 out of 19 domains in Portuguese (improvements based on change in SeMER scores).
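One simple way to feed multiple ASR hypotheses to an NLU model is to concatenate the n-best list with a separator token; this is an illustrative assumption, not necessarily the combination scheme used in the paper:

```python
def combine_nbest(hypotheses, sep="<sep>", n=3):
    """Join the top-n ASR hypotheses into one NLU input string, so the
    downstream intent/NER model can recover from recognition errors in
    the 1-best hypothesis."""
    return f" {sep} ".join(hypotheses[:n])

nbest = ["play jazz", "play chess", "clay jazz"]
combined = combine_nbest(nbest, n=2)
assert combined == "play jazz <sep> play chess"
```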

Abstractive Text Summarization, Automatic Speech Recognition +10

Optimal Mini-Batch Size Selection for Fast Gradient Descent

no code implementations 15 Nov 2019 Michael P. Perrone, Haidar Khan, Changhoan Kim, Anastasios Kyrillidis, Jerry Quinn, Valentina Salapura

This paper presents a methodology for selecting the mini-batch size that minimizes Stochastic Gradient Descent (SGD) learning time for single and multiple learner problems.
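A crude empirical stand-in for such a selection procedure is to sweep candidate batch sizes and time the per-sample cost of a gradient-step-sized operation; note the paper derives its selection analytically, and this toy sweep ignores how batch size affects convergence speed:

```python
import time
import numpy as np

def time_per_sample(batch_size, n_features=256, steps=20):
    """Wall-clock time of a matrix-multiply 'gradient step', amortized per
    sample. Larger batches amortize fixed per-step overheads better, up to
    memory/bandwidth limits."""
    rng = np.random.default_rng(0)
    X = rng.standard_normal((batch_size, n_features))
    W = rng.standard_normal((n_features, n_features))
    t0 = time.perf_counter()
    for _ in range(steps):
        _ = X @ W
    return (time.perf_counter() - t0) / (steps * batch_size)

candidates = [8, 32, 128, 512]
costs = {b: time_per_sample(b) for b in candidates}
best = min(costs, key=costs.get)  # cheapest per-sample step time
```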

Machine Translation, Translation

Deep density ratio estimation for change point detection

no code implementations 23 May 2019 Haidar Khan, Lara Marcuse, Bülent Yener

In this work, we propose new objective functions to train deep neural network based density ratio estimators and apply it to a change point detection problem.
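A minimal illustration of density-ratio-based change point scoring, using histograms in place of the paper's deep density ratio estimators:

```python
import numpy as np

def ratio_score(before, after, bins=10, eps=1e-6):
    """Histogram-based density ratio between two adjacent windows; the
    further the ratio strays from 1 across bins, the more likely a change
    point lies between the windows."""
    lo = min(before.min(), after.min())
    hi = max(before.max(), after.max())
    p, _ = np.histogram(before, bins=bins, range=(lo, hi), density=True)
    q, _ = np.histogram(after, bins=bins, range=(lo, hi), density=True)
    return float(np.mean(np.abs(np.log((q + eps) / (p + eps)))))

rng = np.random.default_rng(0)
same = ratio_score(rng.normal(0, 1, 1000), rng.normal(0, 1, 1000))
shift = ratio_score(rng.normal(0, 1, 1000), rng.normal(3, 1, 1000))
assert shift > same  # a mean shift drives the ratio away from 1
```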

Change Point Detection, Density Ratio Estimation +1

Thwarting finite difference adversarial attacks with output randomization

no code implementations ICLR 2020 Haidar Khan, Daniel Park, Azer Khan, Bülent Yener

Adversarial examples pose a threat to deep neural network models in a variety of scenarios, ranging from a setting where the adversary has complete knowledge of the model to the opposite "black box" setting.

Adversarial Attack

Generation & Evaluation of Adversarial Examples for Malware Obfuscation

no code implementations 9 Apr 2019 Daniel Park, Haidar Khan, Bülent Yener

There has been an increased interest in the application of convolutional neural networks for image based malware classification, but the susceptibility of neural networks to adversarial examples allows malicious actors to evade classifiers.

General Classification, Malware Classification

Learning filter widths of spectral decompositions with wavelets

1 code implementation NeurIPS 2018 Haidar Khan, Bülent Yener

Our results show that the WD layer can improve neural network based time series classifiers both in accuracy and interpretability by learning directly from the input signal.
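A sketch of the idea behind the wavelet decomposition (WD) layer: each filter is parameterized by a single width, which in the paper would be learned by backpropagation; the Mexican-hat shape and the constants here are assumptions for illustration:

```python
import numpy as np

def wavelet_filter(width, support=33):
    """Mexican-hat wavelet whose scale is controlled by one 'width' parameter."""
    t = np.linspace(-4, 4, support) / width
    return (1 - t**2) * np.exp(-t**2 / 2)

def wd_features(signal, widths):
    """One convolution per width; in the paper the widths are updated by
    backprop jointly with the downstream time series classifier."""
    return np.stack([np.convolve(signal, wavelet_filter(w), mode="same")
                     for w in widths])

x = np.sin(np.linspace(0, 20, 200))
feats = wd_features(x, widths=[0.5, 1.0, 2.0])
assert feats.shape == (3, 200)  # one filtered channel per width
```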

Time Series, Time Series Analysis +1

Focal onset seizure prediction using convolutional networks

no code implementations 29 May 2018 Haidar Khan, Lara Marcuse, Madeline Fields, Kalina Swann, Bülent Yener

Significance: We demonstrate that a robust set of features can be learned from scalp EEG that characterize the preictal state of focal seizures.

EEG, Seizure prediction
