Search Results for author: Kabir Ahuja

Found 15 papers, 7 papers with code

Learning Syntax Without Planting Trees: Understanding When and Why Transformers Generalize Hierarchically

no code implementations • 25 Apr 2024 • Kabir Ahuja, Vidhisha Balachandran, Madhur Panwar, Tianxing He, Noah A. Smith, Navin Goyal, Yulia Tsvetkov

Transformers trained on natural language data have been shown to learn its hierarchical structure and generalize to sentences with unseen syntactic structures without explicitly encoding any structural bias.

Inductive Bias • Language Modelling

On Evaluating and Mitigating Gender Biases in Multilingual Settings

no code implementations • 4 Jul 2023 • Aniket Vashishtha, Kabir Ahuja, Sunayana Sitaram

While understanding and removing gender biases in language models has been a long-standing problem in Natural Language Processing, prior research work has primarily been limited to English.

In-Context Learning through the Bayesian Prism

1 code implementation • 8 Jun 2023 • Madhur Panwar, Kabir Ahuja, Navin Goyal

One of the main discoveries in this line of research has been that for several function classes, such as linear regression, transformers successfully generalize to new functions in the class.

Bayesian Inference • In-Context Learning • +4
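
The in-context learning setup referenced above can be sketched as follows: each prompt is a sequence of (x, f(x)) pairs from a freshly sampled linear function, and the model is queried on a new input. This is a minimal illustration of the standard formulation, not code from the paper; all names and defaults are assumptions.

```python
import numpy as np

def make_linear_regression_prompt(dim=8, n_examples=16, rng=None):
    """Build one in-context prompt for a randomly drawn linear function."""
    rng = rng or np.random.default_rng()
    w = rng.standard_normal(dim)                   # task-specific weight vector
    xs = rng.standard_normal((n_examples + 1, dim))
    ys = xs @ w                                    # noiseless targets f(x) = w . x
    context = list(zip(xs[:-1], ys[:-1]))          # in-context (x, f(x)) examples
    query_x, query_y = xs[-1], ys[-1]              # query input and its true label
    return context, query_x, query_y

context, query_x, query_y = make_linear_regression_prompt()
print(len(context), query_x.shape, float(query_y))
```

Generalizing "to new functions in the class" then means predicting query_y well for a weight vector w that was never seen during training, only inferred from the context pairs.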

Breaking Language Barriers with a LEAP: Learning Strategies for Polyglot LLMs

no code implementations • 28 May 2023 • Akshay Nambi, Vaibhav Balloli, Mercy Ranjit, Tanuja Ganu, Kabir Ahuja, Sunayana Sitaram, Kalika Bali

Our results show substantial advancements in multilingual understanding and generation across a diverse range of languages.

Question Answering • Retrieval

MEGA: Multilingual Evaluation of Generative AI

1 code implementation • 22 Mar 2023 • Kabir Ahuja, Harshita Diddee, Rishav Hada, Millicent Ochieng, Krithika Ramesh, Prachi Jain, Akshay Nambi, Tanuja Ganu, Sameer Segal, Maxamed Axmed, Kalika Bali, Sunayana Sitaram

Most studies on generative LLMs have been restricted to English, and it is unclear how capable these models are at understanding and generating text in other languages.

Benchmarking

On the Calibration of Massively Multilingual Language Models

1 code implementation • 21 Oct 2022 • Kabir Ahuja, Sunayana Sitaram, Sandipan Dandapat, Monojit Choudhury

Massively Multilingual Language Models (MMLMs) have recently gained popularity due to their surprising effectiveness in cross-lingual transfer.

Cross-Lingual Transfer
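
Calibration here means the match between a model's confidence and its accuracy. A common summary statistic is the expected calibration error (ECE), sketched generically below with equal-width confidence bins; this is a standard formulation, not the paper's evaluation code.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Expected calibration error with equal-width confidence bins.

    confidences: predicted probability of the chosen class, shape (N,)
    correct:     1 if the prediction was right, else 0, shape (N,)
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap       # weight gap by fraction of samples in bin
    return ece

# A model that is ~90% confident but only ~70% accurate is over-confident.
print(expected_calibration_error([0.9, 0.9, 0.9, 0.6], [1, 0, 1, 1]))
```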

Multi Task Learning For Zero Shot Performance Prediction of Multilingual Models

no code implementations • ACL 2022 • Kabir Ahuja, Shanu Kumar, Sandipan Dandapat, Monojit Choudhury

Massively Multilingual Transformer based Language Models have been observed to be surprisingly effective on zero-shot transfer across languages, though the performance varies from language to language depending on the pivot language(s) used for fine-tuning.

Feature Selection • Multi-Task Learning
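
The performance-prediction problem above can be framed as regression: from features describing a fine-tuning pivot and a target language, predict the zero-shot score on that target. The features, numbers, and least-squares learner below are purely illustrative assumptions, not the paper's feature set or model.

```python
import numpy as np

# Illustrative features per (pivot, target) pair: log pretraining corpus size of
# the target, syntactic similarity to the pivot, subword vocabulary overlap.
# Rows and scores are made-up numbers, only to show the shape of the task.
X = np.array([
    [9.2, 0.81, 0.62],
    [6.4, 0.55, 0.31],
    [7.8, 0.73, 0.48],
    [5.1, 0.40, 0.22],
])
y = np.array([0.78, 0.52, 0.66, 0.41])   # observed zero-shot task scores

# Fit a simple linear predictor with least squares (bias term appended).
A = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Predict the score for an unseen target language.
x_new = np.array([8.0, 0.60, 0.50, 1.0])
print(float(x_new @ coef))
```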

Beyond Static Models and Test Sets: Benchmarking the Potential of Pre-trained Models Across Tasks and Languages

no code implementations • nlppower (ACL) 2022 • Kabir Ahuja, Sandipan Dandapat, Sunayana Sitaram, Monojit Choudhury

Although recent Massively Multilingual Language Models (MMLMs) like mBERT and XLMR support around 100 languages, most existing multilingual NLP benchmarks provide evaluation data in only a handful of these languages with little linguistic diversity.

Benchmarking • Multilingual NLP • +1

On the Economics of Multilingual Few-shot Learning: Modeling the Cost-Performance Trade-offs of Machine Translated and Manual Data

no code implementations • NAACL 2022 • Kabir Ahuja, Monojit Choudhury, Sandipan Dandapat

Borrowing ideas from production functions in micro-economics, in this paper we introduce a framework to systematically evaluate the performance and cost trade-offs between machine-translated and manually-created labelled data for task-specific fine-tuning of massively multilingual language models.

Few-Shot Learning • Machine Translation • +1
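
The production-function idea can be illustrated with a toy model in which manually-created and machine-translated examples are two inputs that jointly "produce" task performance, each at a different per-example cost. The Cobb-Douglas form, exponents, and prices below are assumptions for illustration only; the paper's actual functional form and estimates may differ.

```python
def performance(n_manual, n_mt, alpha=0.6, beta=0.3, scale=0.08):
    """Toy Cobb-Douglas-style production function: performance from two data inputs,
    each with diminishing returns governed by its exponent."""
    return scale * (n_manual ** alpha) * (n_mt ** beta)

def cost(n_manual, n_mt, price_manual=0.50, price_mt=0.02):
    """Hypothetical per-example costs: manual annotation vs. machine translation."""
    return price_manual * n_manual + price_mt * n_mt

# Two ways to spend a labelling budget: mostly manual vs. mostly machine-translated.
for n_manual, n_mt in [(2000, 500), (500, 20000)]:
    p, c = performance(n_manual, n_mt), cost(n_manual, n_mt)
    print(f"manual={n_manual:5d}  mt={n_mt:5d}  perf={p:5.1f}  cost=${c:8.2f}")
```

Comparing performance per dollar across such allocations is the kind of cost-performance trade-off the framework is meant to make explicit.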

On the Practical Ability of Recurrent Neural Networks to Recognize Hierarchical Languages

1 code implementation • COLING 2020 • Satwik Bhattamishra, Kabir Ahuja, Navin Goyal

We find that while recurrent models generalize nearly perfectly if the lengths of the training and test strings are from the same range, they perform poorly if the test strings are longer.
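
A minimal sketch of the kind of length-generalization split this finding refers to, assuming the Dyck-style bracket languages commonly used for probing hierarchical structure; the generator and length ranges are illustrative rather than the paper's exact setup.

```python
import random

def random_dyck(target_len, n_bracket_types=2, rng=random):
    """Sample a balanced bracket string (Dyck-n), a simple hierarchical language."""
    pairs = [("(", ")"), ("[", "]"), ("{", "}")][:n_bracket_types]
    out, stack = [], []
    while len(out) < target_len:
        # Close when we must (to stay near target_len) or with probability 0.5.
        if stack and (len(out) + len(stack) >= target_len or rng.random() < 0.5):
            out.append(stack.pop())
        else:
            opening, closing = rng.choice(pairs)
            out.append(opening)
            stack.append(closing)
    out.extend(reversed(stack))   # close any brackets still open
    return "".join(out)

# Train on short strings, test on strictly longer ones to probe length generalization.
train = [random_dyck(random.randint(4, 50)) for _ in range(1000)]
test = [random_dyck(random.randint(52, 100)) for _ in range(200)]
print(train[0], len(test[0]))
```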

On the Ability and Limitations of Transformers to Recognize Formal Languages

1 code implementation • EMNLP 2020 • Satwik Bhattamishra, Kabir Ahuja, Navin Goyal

Our analysis also provides insights on the role of self-attention mechanism in modeling certain behaviors and the influence of positional encoding schemes on the learning and generalization abilities of the model.
