Search Results for author: Saleh Soltan

Found 8 papers, 1 paper with code

Limitations of Knowledge Distillation for Zero-shot Transfer Learning

no code implementations EMNLP (sustainlp) 2021 Saleh Soltan, Haidar Khan, Wael Hamza

We demonstrate that in contradiction to the previous observation in the case of monolingual distillation, in multilingual settings, distillation during pretraining is more effective than distillation during fine-tuning for zero-shot transfer learning.

Knowledge Distillation, Transfer Learning, +1

GeMQuAD: Generating Multilingual Question Answering Datasets from Large Language Models using Few Shot Learning

no code implementations 14 Apr 2024 Amani Namboori, Shivam Mangale, Andy Rosenbaum, Saleh Soltan

The emergence of Large Language Models (LLMs) with capabilities like In-Context Learning (ICL) has ushered in new possibilities for data generation across various domains while minimizing the need for extensive data collection and modeling techniques.

Extractive Question-Answering, Few-Shot Learning, +4

Recipes for Sequential Pre-training of Multilingual Encoder and Seq2Seq Models

no code implementations 14 Jun 2023 Saleh Soltan, Andy Rosenbaum, Tobias Falke, Qin Lu, Anna Rumshisky, Wael Hamza

(2) Conversely, using an encoder to warm-start seq2seq training, we show that by unfreezing the encoder partway through training, we can match task performance of a from-scratch seq2seq model.

Language Modelling, Masked Language Modeling

CLASP: Few-Shot Cross-Lingual Data Augmentation for Semantic Parsing

no code implementations 13 Oct 2022 Andy Rosenbaum, Saleh Soltan, Wael Hamza, Amir Saffari, Marco Damonte, Isabel Groves

A bottleneck to developing Semantic Parsing (SP) models is the need for a large volume of human-labeled training data.

Data Augmentation, Semantic Parsing

LINGUIST: Language Model Instruction Tuning to Generate Annotated Utterances for Intent Classification and Slot Tagging

no code implementations COLING 2022 Andy Rosenbaum, Saleh Soltan, Wael Hamza, Yannick Versley, Markus Boese

We present LINGUIST, a method for generating annotated data for Intent Classification and Slot Tagging (IC+ST), via fine-tuning AlexaTM 5B, a 5-billion-parameter multilingual sequence-to-sequence (seq2seq) model, on a flexible instruction prompt.

intent-classification, Intent Classification, +3

AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model

1 code implementation 2 Aug 2022 Saleh Soltan, Shankar Ananthakrishnan, Jack FitzGerald, Rahul Gupta, Wael Hamza, Haidar Khan, Charith Peris, Stephen Rawls, Andy Rosenbaum, Anna Rumshisky, Chandana Satya Prakash, Mukund Sridhar, Fabian Triefenbach, Apurv Verma, Gokhan Tur, Prem Natarajan

In this work, we demonstrate that multilingual large-scale sequence-to-sequence (seq2seq) models, pre-trained on a mixture of denoising and Causal Language Modeling (CLM) tasks, are more efficient few-shot learners than decoder-only models on various tasks.

Causal Language Modeling, Common Sense Reasoning, +8
