Search Results for author: Madian Khabsa

Found 33 papers, 15 papers with code

Question Generation using a Scratchpad Encoder

no code implementations • ICLR 2019 • Ryan Y. Benmalek, Madian Khabsa, Suma Desu, Claire Cardie, Michele Banko

In this paper we introduce the Scratchpad Encoder, a novel addition to the sequence-to-sequence (seq2seq) framework, and explore its effectiveness in generating natural language questions from a given logical form.

Question Generation • Question-Generation

RoAST: Robustifying Language Models via Adversarial Perturbation with Selective Training

1 code implementation • 7 Dec 2023 • Jaehyung Kim, Yuning Mao, Rui Hou, Hanchao Yu, Davis Liang, Pascale Fung, Qifan Wang, Fuli Feng, Lifu Huang, Madian Khabsa

Under a unified evaluation of fine-tuned LMs that incorporates four representative perspectives of model robustness, we demonstrate the effectiveness of RoAST compared with state-of-the-art fine-tuning methods on six different types of LMs, indicating its usefulness in practice.

Adversarial Robustness

MART: Improving LLM Safety with Multi-round Automatic Red-Teaming

no code implementations • 13 Nov 2023 • Suyu Ge, Chunting Zhou, Rui Hou, Madian Khabsa, Yi-Chia Wang, Qifan Wang, Jiawei Han, Yuning Mao

Specifically, an adversarial LLM and a target LLM interact iteratively: the adversarial LLM generates challenging prompts that elicit unsafe responses from the target LLM, while the target LLM is fine-tuned with safety-aligned data on these adversarial prompts.
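
To make the described interplay concrete, a minimal sketch of such a multi-round loop follows; `adversary`, `target`, `is_unsafe`, and `fine_tune` are hypothetical placeholder callables, not the paper's implementation.

```python
# Hypothetical sketch of a MART-style multi-round red-teaming loop.
def mart_loop(adversary, target, seed_prompts, is_unsafe, fine_tune, rounds=4):
    prompts = list(seed_prompts)
    for _ in range(rounds):
        # The adversarial LLM turns each prompt into a harder attack.
        attacks = [adversary(p) for p in prompts]
        responses = [target(a) for a in attacks]
        # Keep the attacks that actually elicited unsafe responses...
        failures = [a for a, r in zip(attacks, responses) if is_unsafe(r)]
        # ...and fine-tune the target on safety-aligned data for them.
        target = fine_tune(target, failures)
        prompts = failures or prompts  # seed the next round with what worked
    return target
```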

Instruction Following • Response Generation

On the Equivalence of Graph Convolution and Mixup

no code implementations • 29 Sep 2023 • Xiaotian Han, Hanqing Zeng, Yu Chen, Shaoliang Nie, Jingzhou Liu, Kanika Narang, Zahra Shakeri, Karthik Abinav Sankararaman, Song Jiang, Madian Khabsa, Qifan Wang, Xia Hu

We establish this equivalence mathematically by demonstrating that graph convolutional networks (GCN) and simplified graph convolution (SGC) can be expressed as a form of Mixup.
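
A sketch of the claimed correspondence, with notation assumed for illustration rather than taken from the paper:

```latex
% One (simplified) graph-convolution step at node i averages neighbor features:
\[
  h_i \;=\; \sum_{j \in \tilde{\mathcal{N}}(i)} \frac{1}{|\tilde{\mathcal{N}}(i)|}\, x_j W ,
\]
% which is a convex combination of inputs, i.e. exactly the Mixup form
\[
  \tilde{x} \;=\; \sum_{j} \lambda_j x_j , \qquad \lambda_j \ge 0 , \quad \sum_{j} \lambda_j = 1 ,
\]
% with uniform weights and, under a homophily assumption, all mixed
% samples sharing node i's label.
```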

Data Augmentation

Effective Long-Context Scaling of Foundation Models

1 code implementation • 27 Sep 2023 • Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, Hao Ma

We also examine the impact of various design choices in the pretraining process, including the data mix and the training curriculum of sequence lengths. Our ablation experiments suggest that having abundant long texts in the pretraining dataset is not the key to strong performance, and we empirically verify that long-context continual pretraining is more efficient than, and similarly effective to, pretraining from scratch with long sequences.

Continual Pretraining • Language Modelling

Residual Prompt Tuning: Improving Prompt Tuning with Residual Reparameterization

1 code implementation • 6 May 2023 • Anastasia Razdaibiedina, Yuning Mao, Rui Hou, Madian Khabsa, Mike Lewis, Jimmy Ba, Amjad Almahairi

In this work, we introduce Residual Prompt Tuning - a simple and efficient method that significantly improves the performance and stability of prompt tuning.
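
The core idea, passing soft prompt embeddings through a shallow network with a skip connection, can be sketched as follows; the dimensions and MLP shape are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class ResidualPrompt(nn.Module):
    """Minimal sketch of residual reparameterization for soft prompts."""

    def __init__(self, num_tokens=20, dim=768, bottleneck=128):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(num_tokens, dim))
        # Shallow MLP through which the prompt embeddings are passed.
        self.mlp = nn.Sequential(
            nn.Linear(dim, bottleneck), nn.ReLU(), nn.Linear(bottleneck, dim)
        )

    def forward(self):
        # The residual (skip) connection is what stabilizes prompt tuning.
        return self.prompt + self.mlp(self.prompt)
```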

MMViT: Multiscale Multiview Vision Transformers

no code implementations • 28 Apr 2023 • Yuchen Liu, Natasha Ong, Kaiyan Peng, Bo Xiong, Qifan Wang, Rui Hou, Madian Khabsa, Kaiyue Yang, David Liu, Donald S. Williamson, Hanchao Yu

Our model encodes multiple views of the input signal and builds several channel-resolution feature stages that process these views at different resolutions in parallel.
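
As a rough illustration of parallel channel-resolution views (the two-view stem and its strides below are assumptions, not the actual MMViT architecture):

```python
import torch
import torch.nn as nn

class MultiViewStem(nn.Module):
    """Illustrative sketch only: embed two resolutions ("views") of the
    same image in parallel as token sequences."""

    def __init__(self, dim=96):
        super().__init__()
        self.fine = nn.Conv2d(3, dim, kernel_size=4, stride=4)    # fine view
        self.coarse = nn.Conv2d(3, dim, kernel_size=8, stride=8)  # coarse view

    def forward(self, x):  # x: (B, 3, H, W)
        views = [self.fine(x), self.coarse(x)]
        # Flatten each view into a (B, tokens, dim) sequence for a transformer.
        return [v.flatten(2).transpose(1, 2) for v in views]
```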

Image Classification

SVT: Supertoken Video Transformer for Efficient Video Understanding

no code implementations • 1 Apr 2023 • Chenbin Pan, Rui Hou, Hanchao Yu, Qifan Wang, Senem Velipasalar, Madian Khabsa

Whether they process videos at a fixed resolution from start to end or incorporate pooling and down-scaling strategies, existing video transformers process the entire video content throughout the network without special handling of its large portions of redundant information.

Video Understanding

Progressive Prompts: Continual Learning for Language Models

2 code implementations • 29 Jan 2023 • Anastasia Razdaibiedina, Yuning Mao, Rui Hou, Madian Khabsa, Mike Lewis, Amjad Almahairi

We introduce Progressive Prompts - a simple and efficient approach for continual learning in language models.
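
A minimal sketch of the progressive scheme, in which each new task adds a fresh soft prompt while all earlier prompts stay frozen; shapes and sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ProgressivePrompts(nn.Module):
    """Sketch: one soft prompt per task, frozen once its task is done."""

    def __init__(self, dim=768, tokens_per_task=10):
        super().__init__()
        self.dim, self.k = dim, tokens_per_task
        self.prompts = nn.ParameterList()

    def new_task(self):
        # Freeze the prompts learned on all previous tasks.
        for p in self.prompts:
            p.requires_grad_(False)
        self.prompts.append(nn.Parameter(torch.randn(self.k, self.dim)))

    def forward(self):
        # Concatenate every task's prompt; only the newest gets gradients.
        return torch.cat(list(self.prompts), dim=0)
```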

Continual Learning

Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI

no code implementations • 25 May 2022 • Suzanna Sia, Anton Belyy, Amjad Almahairi, Madian Khabsa, Luke Zettlemoyer, Lambert Mathias

Evaluating an explanation's faithfulness is desirable for many reasons, such as trust, interpretability, and diagnosing the sources of a model's errors.

counterfactual

A Fistful of Words: Learning Transferable Visual Models from Bag-of-Words Supervision

no code implementations • 27 Dec 2021 • Ajinkya Tejankar, Maziar Sanjabi, Bichen Wu, Saining Xie, Madian Khabsa, Hamed Pirsiavash, Hamed Firooz

In this paper, we focus on teasing out what parts of the language supervision are essential for training zero-shot image classification models.
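
One such ablation of the supervision signal is to reduce each caption to an unordered bag of words before CLIP-style contrastive training; the transform below is a hedged sketch of that setup, not the paper's code.

```python
import random

def bag_of_words(caption: str, keep: float = 1.0) -> str:
    """Discard word order (and optionally a fraction of words) from a caption."""
    words = caption.lower().split()
    random.shuffle(words)                # word order is thrown away
    k = max(1, int(len(words) * keep))   # optionally keep only a subset
    return " ".join(words[:k])

print(bag_of_words("a brown dog catching a frisbee in the park"))
```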

Classification • Image Captioning • +3

Quantifying Adaptability in Pre-trained Language Models with 500 Tasks

2 code implementations • NAACL 2022 • Belinda Z. Li, Jane Yu, Madian Khabsa, Luke Zettlemoyer, Alon Halevy, Jacob Andreas

When a neural language model (LM) is adapted to perform a new task, what aspects of the task predict the eventual performance of the model?

Language Modelling • Logical Reasoning • +2

Sparse Distillation: Speeding Up Text Classification by Using Bigger Student Models

1 code implementation • NAACL 2022 • Qinyuan Ye, Madian Khabsa, Mike Lewis, Sinong Wang, Xiang Ren, Aaron Jaech

Distilling state-of-the-art transformer models into lightweight student models is an effective way to reduce computation cost at inference time.
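
The paper's students are sparse n-gram-based models, but the distillation objective itself is standard; a generic sketch of the temperature-softened KL loss (the temperature value is an illustrative assumption):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between temperature-softened teacher and student
    distributions, scaled by T^2 as in standard knowledge distillation."""
    s = F.log_softmax(student_logits / T, dim=-1)
    t = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * (T * T)
```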

Domain Generalization • Privacy Preserving • +4

UniPELT: A Unified Framework for Parameter-Efficient Language Model Tuning

1 code implementation • ACL 2022 • Yuning Mao, Lambert Mathias, Rui Hou, Amjad Almahairi, Hao Ma, Jiawei Han, Wen-tau Yih, Madian Khabsa

Recent parameter-efficient language model tuning (PELT) methods manage to match the performance of fine-tuning with much fewer trainable parameters and perform especially well when training data is limited.

Language Modelling • Model Selection

Entailment as Few-Shot Learner

3 code implementations • 29 Apr 2021 • Sinong Wang, Han Fang, Madian Khabsa, Hanzi Mao, Hao Ma

Large pre-trained language models (LMs) have demonstrated remarkable ability as few-shot learners.
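
The title's recipe, recasting classification as textual entailment with label descriptions as hypotheses, can be sketched with an off-the-shelf NLI model; the model and template below are assumptions, not the paper's exact setup.

```python
from transformers import pipeline

# An NLI model scores each label description as a hypothesis about the input.
nli = pipeline("zero-shot-classification", model="roberta-large-mnli")

result = nli(
    "The movie was a complete waste of time.",
    candidate_labels=["positive", "negative"],
    hypothesis_template="This example is {}.",
)
print(result["labels"][0])  # most-entailed label, expected: "negative"
```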

Contrastive Learning • Data Augmentation • +8

On Unifying Misinformation Detection

1 code implementation • NAACL 2021 • Nayeon Lee, Belinda Z. Li, Sinong Wang, Pascale Fung, Hao Ma, Wen-tau Yih, Madian Khabsa

In this paper, we introduce UnifiedM2, a general-purpose misinformation model that jointly models multiple domains of misinformation with a single, unified setup.

Few-Shot Learning • Misinformation

Towards Few-Shot Fact-Checking via Perplexity

no code implementations • NAACL 2021 • Nayeon Lee, Yejin Bang, Andrea Madotto, Madian Khabsa, Pascale Fung

Through experiments, we empirically verify the plausibility of the rather surprising usage of the perplexity score in the context of fact-checking and highlight the strength of our few-shot methodology by comparing it to strong fine-tuning-based baseline models.
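
A minimal sketch of perplexity-based verification in this spirit; the model choice is an illustrative assumption, and the decision threshold would be picked on a handful of labeled examples.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = lm(ids, labels=ids).loss  # mean per-token negative log-likelihood
    return torch.exp(loss).item()

evidence = "The Eiffel Tower is located in Paris, France."
claim = "The Eiffel Tower is in Paris."
# Lower perplexity of evidence followed by claim suggests the claim is supported.
print(perplexity(evidence + " " + claim))
```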

Fact Checking • Few-Shot Learning • +5

To Pretrain or Not to Pretrain: Examining the Benefits of Pretraining on Resource Rich Tasks

no code implementations • ACL 2020 • Sinong Wang, Madian Khabsa, Hao Ma

Pretraining NLP models with variants of the Masked Language Model (MLM) objective has recently led to significant improvements on many tasks.

Language Modelling • text-classification • +1

To Pretrain or Not to Pretrain: Examining the Benefits of Pretraining on Resource Rich Tasks

no code implementations • 15 Jun 2020 • Sinong Wang, Madian Khabsa, Hao Ma

Pretraining NLP models with variants of the Masked Language Model (MLM) objective has recently led to significant improvements on many tasks.

Language Modelling • text-classification • +1

Linformer: Self-Attention with Linear Complexity

15 code implementations • 8 Jun 2020 • Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma

Large transformer models have shown extraordinary success in achieving state-of-the-art results in many natural language processing applications.
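
The linear-complexity idea is to project keys and values from sequence length n down to a fixed k before attention, so the cost is O(nk) rather than O(n^2); a single-head sketch with illustrative hyperparameters:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinformerSelfAttention(nn.Module):
    """Single-head sketch; inputs must have length seq_len exactly."""

    def __init__(self, dim=64, seq_len=512, k=64):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, 2 * dim)
        # Learned low-rank projections over the length dimension.
        self.E = nn.Parameter(torch.randn(k, seq_len) / seq_len ** 0.5)
        self.F = nn.Parameter(torch.randn(k, seq_len) / seq_len ** 0.5)
        self.scale = dim ** -0.5

    def forward(self, x):                    # x: (batch, n, dim)
        q = self.q(x)
        k_, v = self.kv(x).chunk(2, dim=-1)
        k_ = self.E @ k_                     # (batch, k, dim): length n -> k
        v = self.F @ v                       # (batch, k, dim)
        attn = F.softmax(q @ k_.transpose(-2, -1) * self.scale, dim=-1)
        return attn @ v                      # (batch, n, dim)
```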

Language Modelling

Language Models as Fact Checkers?

no code implementations • WS 2020 • Nayeon Lee, Belinda Z. Li, Sinong Wang, Wen-tau Yih, Hao Ma, Madian Khabsa

Recent work has suggested that language models (LMs) store both common-sense and factual knowledge learned from pre-training data.

Common Sense Reasoning • Language Modelling • +2

Keeping Notes: Conditional Natural Language Generation with a Scratchpad Encoder

no code implementations • ACL 2019 • Ryan Benmalek, Madian Khabsa, Suma Desu, Claire Cardie, Michele Banko

We introduce the Scratchpad Mechanism, a novel addition to the sequence-to-sequence (seq2seq) neural network architecture and demonstrate its effectiveness in improving the overall fluency of seq2seq models for natural language generation tasks.

Machine Translation • Question Generation • +4

Keeping Notes: Conditional Natural Language Generation with a Scratchpad Mechanism

1 code implementation • 12 Jun 2019 • Ryan Y. Benmalek, Madian Khabsa, Suma Desu, Claire Cardie, Michele Banko

We introduce the Scratchpad Mechanism, a novel addition to the sequence-to-sequence (seq2seq) neural network architecture and demonstrate its effectiveness in improving the overall fluency of seq2seq models for natural language generation tasks.
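
A hedged, single-step sketch of a scratchpad-style write, in which the decoder attends over the encoder states and writes an update back into them; the matrices and the exact update rule are illustrative assumptions.

```python
import torch

def scratchpad_update(enc_states, dec_state, W_a, W_u):
    """enc_states: (n, dim) scratchpad cells; dec_state: (dim,) decoder state;
    W_a, W_u: (dim, dim) assumed attention and update matrices."""
    # Attention weights over the encoder "scratchpad" cells.
    alpha = torch.softmax(enc_states @ (W_a @ dec_state), dim=0)   # (n,)
    # Decoder-conditioned update vector written into the attended cells.
    update = torch.tanh(W_u @ dec_state)                           # (dim,)
    a = alpha.unsqueeze(1)
    return (1 - a) * enc_states + a * update                       # (n, dim)
```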

Machine Translation • Question Generation • +4

Adversarial Training for Community Question Answer Selection Based on Multi-scale Matching

no code implementations • 22 Apr 2018 • Xiao Yang, Miaosen Wang, Wei Wang, Madian Khabsa, Ahmed Awadallah, Daniel Kifer, C. Lee Giles

We frame this task as a binary (relevant/irrelevant) classification problem and present an adversarial training framework to alleviate the label imbalance issue.

Answer Selection • General Classification
