Search Results for author: Harshita Diddee

Found 8 papers, 4 papers with code

"Fifty Shades of Bias": Normative Ratings of Gender Bias in GPT Generated English Text

no code implementations • 26 Oct 2023 • Rishav Hada, Agrima Seth, Harshita Diddee, Kalika Bali

Next, we systematically analyze the variation of themes of gender biases in the observed ranking and show that identity-attack is most closely related to gender bias.

Binary Classification, Text Generation

Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation?

no code implementations • 14 Sep 2023 • Rishav Hada, Varun Gumma, Adrian de Wynter, Harshita Diddee, Mohamed Ahmed, Monojit Choudhury, Kalika Bali, Sunayana Sitaram

Large Language Models (LLMs) excel in various Natural Language Processing (NLP) tasks, yet their evaluation, particularly in languages beyond the top 20, remains inadequate due to the limitations of existing benchmarks and metrics.

Language Modelling, Large Language Model, +2 more

MEGA: Multilingual Evaluation of Generative AI

1 code implementation • 22 Mar 2023 • Kabir Ahuja, Harshita Diddee, Rishav Hada, Millicent Ochieng, Krithika Ramesh, Prachi Jain, Akshay Nambi, Tanuja Ganu, Sameer Segal, Maxamed Axmed, Kalika Bali, Sunayana Sitaram

Most studies on generative LLMs have been restricted to English and it is unclear how capable these models are at understanding and generating text in other languages.

Benchmarking

Learnings from Technological Interventions in a Low Resource Language: Enhancing Information Access in Gondi

1 code implementation • 29 Nov 2022 • Devansh Mehta, Harshita Diddee, Ananya Saxena, Anurag Shukla, Sebastin Santy, Ramaravind Kommiya Mothilal, Brij Mohan Lal Srivastava, Alok Sharma, Vishnu Prasad, Venkanna U, Kalika Bali

The primary obstacle to developing technologies for low-resource languages is the lack of representative, usable data.

Machine Translation, Translation

Too Brittle To Touch: Comparing the Stability of Quantization and Distillation Towards Developing Lightweight Low-Resource MT Models

1 code implementation • 27 Oct 2022 • Harshita Diddee, Sandipan Dandapat, Monojit Choudhury, Tanuja Ganu, Kalika Bali

Leveraging shared learning through Massively Multilingual Models, state-of-the-art machine translation models are often able to adapt to the paucity of data for low-resource languages.

Knowledge Distillation, Machine Translation, +1 more

Towards Quantifying the Carbon Emissions of Differentially Private Machine Learning

no code implementations • 14 Jul 2021 • Rakshit Naidu, Harshita Diddee, Ajinkya Mulay, Aleti Vardhan, Krithika Ramesh, Ahmed Zamzam

In recent years, machine learning techniques utilizing large-scale datasets have achieved remarkable performance.

BIG-bench Machine Learning

Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages

1 code implementation • 12 Apr 2021 • Gowtham Ramesh, Sumanth Doddapaneni, Aravinth Bheemaraj, Mayank Jobanputra, Raghavan AK, Ajitesh Sharma, Sujit Sahoo, Harshita Diddee, Mahalakshmi J, Divyanshu Kakwani, Navneet Kumar, Aswin Pradeep, Srihari Nagaraj, Kumar Deepak, Vivek Raghavan, Anoop Kunchukuttan, Pratyush Kumar, Mitesh Shantadevi Khapra

We mine the parallel sentences from the web by combining many corpora, tools, and methods: (a) web-crawled monolingual corpora, (b) document OCR for extracting sentences from scanned documents, (c) multilingual representation models for aligning sentences, and (d) approximate nearest neighbor search for searching in a large collection of sentences.

Machine Translation, Multilingual NLP, +3 more

PsuedoProp at SemEval-2020 Task 11: Propaganda Span Detection Using BERT-CRF and Ensemble Sentence Level Classifier

no code implementations • SEMEVAL 2020 • Aniruddha Chauhan, Harshita Diddee

This paper explains our team's submission to the Shared Task of Fine-Grained Propaganda Detection, in which we propose a sequential BERT-CRF-based Span Identification model where the fine-grained detection is carried out only on the articles that are flagged as containing propaganda by an ensemble SLC model.

Propaganda detection, Sentence
