Search Results for author: Sreyan Ghosh

Found 33 papers, 23 papers with code

Span Extraction Aided Improved Code-mixed Sentiment Classification

1 code implementation • COLING (WNUT) 2022 • Ramaneswaran S, Sean Benhur, Sreyan Ghosh

Sentiment classification is a fundamental NLP task of detecting the sentiment polarity of a given text.

CoDa: Constrained Generation based Data Augmentation for Low-Resource NLP

no code implementations • 30 Mar 2024 • Chandra Kiran Reddy Evuru, Sreyan Ghosh, Sonal Kumar, Ramaneswaran S, Utkarsh Tyagi, Dinesh Manocha

We present CoDa (Constrained Generation based Data Augmentation), a controllable, effective, and training-free data augmentation technique for low-resource (data-scarce) NLP.

Data Augmentation • Instruction Following

Do Vision-Language Models Understand Compound Nouns?

no code implementations • 30 Mar 2024 • Sonal Kumar, Sreyan Ghosh, S Sakshi, Utkarsh Tyagi, Dinesh Manocha

We curate Compun, a novel benchmark with 400 unique and commonly used CNs, to evaluate the effectiveness of VLMs in interpreting CNs.

Image Retrieval • Language Modelling +2

A Closer Look at the Limitations of Instruction Tuning

no code implementations • 3 Feb 2024 • Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar, Ramaneswaran S, Deepali Aneja, Zeyu Jin, Ramani Duraiswami, Dinesh Manocha

Our findings reveal that responses generated solely from pre-trained knowledge consistently outperform responses by models that learn any form of new knowledge from IT on open-source datasets.

Hallucination

Stable Distillation: Regularizing Continued Pre-training for Low-Resource Automatic Speech Recognition

1 code implementation • 20 Dec 2023 • Ashish Seth, Sreyan Ghosh, S. Umesh, Dinesh Manocha

Specifically, first, we perform vanilla continued pre-training on an initial SSL pre-trained model on the target domain ASR dataset and call it the teacher.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +1

FusDom: Combining In-Domain and Out-of-Domain Knowledge for Continuous Self-Supervised Learning

1 code implementation • 20 Dec 2023 • Ashish Seth, Sreyan Ghosh, S. Umesh, Dinesh Manocha

Continued pre-training (CP) offers multiple advantages, like target domain adaptation and the potential to exploit the continuous stream of unlabeled data available online.

Domain Adaptation • Self-Supervised Learning

AV-RIR: Audio-Visual Room Impulse Response Estimation

no code implementations • 30 Nov 2023 • Anton Ratnarajah, Sreyan Ghosh, Sonal Kumar, Purva Chiniya, Dinesh Manocha

We propose AV-RIR, a novel multi-modal multi-task learning approach to accurately estimate the RIR from a given reverberant speech signal and the visual cues of its corresponding environment.

Multi-Task Learning • Room Impulse Response (RIR) +1

DALE: Generative Data Augmentation for Low-Resource Legal NLP

1 code implementation • 24 Oct 2023 • Sreyan Ghosh, Chandra Kiran Evuru, Sonal Kumar, S Ramaneswaran, S Sakshi, Utkarsh Tyagi, Dinesh Manocha

We present DALE, a novel and effective generative Data Augmentation framework for low-resource LEgal NLP.

Data Augmentation • Denoising +2

CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models

no code implementations • 12 Oct 2023 • Sreyan Ghosh, Ashish Seth, Sonal Kumar, Utkarsh Tyagi, Chandra Kiran Evuru, S. Ramaneswaran, S. Sakshi, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha

In this paper, we propose CompA, a collection of two expert-annotated benchmarks with a majority of real-world audio samples, to evaluate compositional reasoning in ALMs.

Attribute • Audio Classification +1

RECAP: Retrieval-Augmented Audio Captioning

1 code implementation • 18 Sep 2023 • Sreyan Ghosh, Sonal Kumar, Chandra Kiran Reddy Evuru, Ramani Duraiswami, Dinesh Manocha

We present RECAP (REtrieval-Augmented Audio CAPtioning), a novel and effective audio captioning system that generates captions conditioned on an input audio and other captions similar to the audio retrieved from a datastore.

AudioCaps • Audio captioning +2

AdVerb: Visually Guided Audio Dereverberation

no code implementations • ICCV 2023 • Sanjoy Chowdhury, Sreyan Ghosh, Subhrajyoti Dasgupta, Anton Ratnarajah, Utkarsh Tyagi, Dinesh Manocha

We present AdVerb, a novel audio-visual dereverberation framework that uses visual cues in addition to the reverberant sound to estimate clean audio.

Speaker Verification • Speech Enhancement +2

ASPIRE: Language-Guided Augmentation for Robust Image Classification

no code implementations • 19 Aug 2023 • Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar, Utkarsh Tyagi, Sakshi Singh, Sanjoy Chowdhury, Dinesh Manocha

This paper presents ASPIRE (Language-guided data Augmentation for SPurIous correlation REmoval), a simple yet effective solution for expanding the training dataset with synthetic images without spurious features.

Classification • Data Augmentation +2

ACLM: A Selective-Denoising based Generative Data Augmentation Approach for Low-Resource Complex NER

1 code implementation • 1 Jun 2023 • Sreyan Ghosh, Utkarsh Tyagi, Manan Suri, Sonal Kumar, S Ramaneswaran, Dinesh Manocha

In addition, we demonstrate the application of ACLM to other domains that suffer from data scarcity (e.g., biomedical).

Data Augmentation • Denoising +5

BioAug: Conditional Generation based Data Augmentation for Low-Resource Biomedical NER

1 code implementation • 18 May 2023 • Sreyan Ghosh, Utkarsh Tyagi, Sonal Kumar, Dinesh Manocha

Though data augmentation has been shown to be highly effective for low-resource NER in general, existing data augmentation techniques fail to produce factual and diverse augmentations for BioNER.

Data Augmentation • named-entity-recognition +2

UNFUSED: UNsupervised Finetuning Using SElf supervised Distillation

1 code implementation • 10 Mar 2023 • Ashish Seth, Sreyan Ghosh, S. Umesh, Dinesh Manocha

Unlike prior works, which directly fine-tune a self-supervised pre-trained encoder on a target dataset, we use the encoder to generate pseudo-labels for unsupervised fine-tuning before the actual fine-tuning step.

Audio Classification • Self-Supervised Learning

CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network

1 code implementation • 2 Mar 2023 • Sreyan Ghosh, Manan Suri, Purva Chiniya, Utkarsh Tyagi, Sonal Kumar, Dinesh Manocha

The tremendous growth of social media users interacting in online conversations has led to significant growth in hate speech, affecting people from various demographics.

A novel multimodal dynamic fusion network for disfluency detection in spoken utterances

no code implementations • 27 Nov 2022 • Sreyan Ghosh, Utkarsh Tyagi, Sonal Kumar, Manan Suri, Rajiv Ratn Shah

Based on early-fusion and self-attention-based multimodal interaction between text and acoustic modalities, in this paper, we propose a novel multimodal architecture for disfluency detection from individual utterances.

data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student training setup

1 code implementation • 2 Nov 2022 • Vasista Sai Lodagala, Sreyan Ghosh, S. Umesh

In this paper, we propose a new Self-Supervised Learning (SSL) algorithm called data2vec-aqc, for speech representation learning from unlabeled speech data.

Automatic Speech Recognition (ASR) • Representation Learning +1

SLICER: Learning universal audio representations using low-resource self-supervised pre-training

1 code implementation • 2 Nov 2022 • Ashish Seth, Sreyan Ghosh, S. Umesh, Dinesh Manocha

We present a new Self-Supervised Learning (SSL) approach to pre-train encoders on unlabeled audio data that reduces the need for large amounts of labeled data for audio and speech classification.

Audio Classification • Clustering +3

MAST: Multiscale Audio Spectrogram Transformers

1 code implementation • 2 Nov 2022 • Sreyan Ghosh, Ashish Seth, S. Umesh, Dinesh Manocha

We present Multiscale Audio Spectrogram Transformer (MAST) for audio classification, which brings the concept of multiscale feature hierarchies to the Audio Spectrogram Transformer (AST).

Audio Classification • Keyword Spotting +1

CCC-wav2vec 2.0: Clustering aided Cross Contrastive Self-supervised learning of speech representations

1 code implementation • 5 Oct 2022 • Vasista Sai Lodagala, Sreyan Ghosh, S. Umesh

While Self-Supervised Learning has helped reap the benefit of the scale from the available unlabeled data, the learning paradigms are continuously being bettered.

Automatic Speech Recognition (ASR) • Clustering +2

PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations

1 code implementation • 31 Mar 2022 • Lodagala V S V Durga Prasad, Sreyan Ghosh, S. Umesh

To alleviate this issue, we propose PADA (Pruning Assisted Domain Adaptation) and zero out redundant weights from models pre-trained on large amounts of out-of-domain (OOD) data.

Domain Adaptation • Language Modelling +1

MMER: Multimodal Multi-task Learning for Speech Emotion Recognition

1 code implementation • 31 Mar 2022 • Sreyan Ghosh, Utkarsh Tyagi, S Ramaneswaran, Harshvardhan Srivastava, Dinesh Manocha

In this paper, we propose MMER, a novel Multimodal Multi-task learning approach for Speech Emotion Recognition.

Ranked #2 on Speech Emotion Recognition on IEMOCAP (using extra training data)

Multi-Task Learning • Speech Emotion Recognition

Analyzing the factors affecting usefulness of Self-Supervised Pre-trained Representations for Speech Recognition

no code implementations • 31 Mar 2022 • Ashish Seth, Lodagala V S V Durga Prasad, Sreyan Ghosh, S. Umesh

Self-supervised learning (SSL) to learn high-level speech representations has been a popular approach to building Automatic Speech Recognition (ASR) systems in low-resource settings.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +2

M-MELD: A Multilingual Multi-Party Dataset for Emotion Recognition in Conversations

1 code implementation • 31 Mar 2022 • Sreyan Ghosh, S Ramaneswaran, Utkarsh Tyagi, Harshvardhan Srivastava, Samden Lepcha, S Sakshi, Dinesh Manocha

Expression of emotions is a crucial part of daily human communication.

Emotion Recognition

Span Classification with Structured Information for Disfluency Detection in Spoken Utterances

1 code implementation • 30 Mar 2022 • Sreyan Ghosh, Sonal Kumar, Yaman Kumar Singla, Rajiv Ratn Shah, S. Umesh

Existing approaches in disfluency detection focus on solving a token-level classification task for identifying and removing disfluencies in text.

Classification

DeLoRes: Decorrelating Latent Spaces for Low-Resource Audio Representation Learning

1 code implementation • 25 Mar 2022 • Sreyan Ghosh, Ashish Seth, Deepak Mittal, Maneesh Singh, S. Umesh

Inspired by the recent progress in self-supervised learning for computer vision, in this paper we introduce DeLoRes, a new general-purpose audio representation learning approach.

Representation Learning • Self-Supervised Learning +1

Leveraging Transformers for Hate Speech Detection in Conversational Code-Mixed Tweets

no code implementations • 18 Dec 2021 • Zaki Mustafa Farooqi, Sreyan Ghosh, Rajiv Ratn Shah

In the current era of the internet, where social media platforms are easily accessible for everyone, people often have to deal with threats, identity attacks, hate, and bullying due to their association with a caste, creed, gender, religion, or even acceptance or rejection of a notion.

Hate Speech Detection

DECAR: Deep Clustering for learning general-purpose Audio Representations

1 code implementation • 17 Oct 2021 • Sreyan Ghosh, Sandesh V Katta, Ashish Seth, S. Umesh

We introduce DECAR, a self-supervised pre-training approach for learning general-purpose audio representations.

Clustering • Deep Clustering +2

DeToxy: A Large-Scale Multimodal Dataset for Toxicity Classification in Spoken Utterances

1 code implementation • 14 Oct 2021 • Sreyan Ghosh, Samden Lepcha, S Sakshi, Rajiv Ratn Shah, S. Umesh

We believe that our dataset would act as a benchmark for the relatively new and unexplored Spoken Language Processing task of detecting toxicity from spoken utterances and boost further research in this space.

Cisco at SemEval-2021 Task 5: What's Toxic?: Leveraging Transformers for Multiple Toxic Span Extraction from Online Comments

1 code implementation • SEMEVAL 2021 • Sreyan Ghosh, Sonal Kumar

We also explore a dependency parsing approach where we extract spans from the input sentence under the supervision of target span boundaries and rank our spans using a biaffine model.

Attribute • Binary Classification +5

Cisco at AAAI-CAD21 shared task: Predicting Emphasis in Presentation Slides using Contextualized Embeddings

1 code implementation • 10 Jan 2021 • Sreyan Ghosh, Sonal Kumar, Harsh Jalan, Hemant Yadav, Rajiv Ratn Shah

This paper describes our proposed system for the AAAI-CAD21 shared task: Predicting Emphasis in Presentation Slides.

End-to-end Named Entity Recognition from English Speech

1 code implementation • 22 May 2020 • Hemant Yadav, Sreyan Ghosh, Yi Yu, Rajiv Ratn Shah

Named entity recognition (NER) from text has been a widely studied problem and usually extracts semantic information from text.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +4
