Search Results for author: Rajat Hebbar

Found 11 papers, 4 papers with code

MM-AU: Towards Multimodal Understanding of Advertisement Videos

no code implementations27 Aug 2023 Digbalay Bose, Rajat Hebbar, Tiantian Feng, Krishna Somandepalli, Anfeng Xu, Shrikanth Narayanan

Advertisement videos (ads) play an integral part in Internet e-commerce: they amplify the reach of particular products to a broad audience and can serve as a medium to raise awareness about specific issues through concise narrative structures.

Robust Self Supervised Speech Embeddings for Child-Adult Classification in Interactions involving Children with Autism

no code implementations31 Jul 2023 Rimita Lahiri, Tiantian Feng, Rajat Hebbar, Catherine Lord, So Hyun Kim, Shrikanth Narayanan

We address the problem of detecting who spoke when in child-inclusive spoken interactions, i.e., automatic child-adult speaker classification.

Classification

FedMultimodal: A Benchmark For Multimodal Federated Learning

no code implementations15 Jun 2023 Tiantian Feng, Digbalay Bose, Tuo Zhang, Rajat Hebbar, Anil Ramakrishna, Rahul Gupta, Mi Zhang, Salman Avestimehr, Shrikanth Narayanan

To facilitate research in multimodal FL, we introduce FedMultimodal, the first FL benchmark for multimodal learning, covering five representative multimodal applications from ten commonly used datasets with a total of eight unique modalities.

Emotion Recognition, Federated Learning +1

Understanding Spoken Language Development of Children with ASD Using Pre-trained Speech Embeddings

no code implementations23 May 2023 Anfeng Xu, Rajat Hebbar, Rimita Lahiri, Tiantian Feng, Lindsay Butler, Lue Shen, Helen Tager-Flusberg, Shrikanth Narayanan

This paper proposes applications of speech processing technologies in support of automated assessment of children's spoken language development, classifying between child and adult speech and between speech and nonverbal vocalization in NLS, with respective F1 macro scores of 82.6% and 67.8%, underscoring the potential for accurate and scalable tools for ASD research and clinical use.

Contextually-rich human affect perception using multimodal scene information

1 code implementation13 Mar 2023 Digbalay Bose, Rajat Hebbar, Krishna Somandepalli, Shrikanth Narayanan

The process of human affect understanding involves the ability to infer person-specific emotional states from various sources including images, speech, and language.

A Review of Speech-centric Trustworthy Machine Learning: Privacy, Safety, and Fairness

no code implementations18 Dec 2022 Tiantian Feng, Rajat Hebbar, Nicholas Mehlman, Xuan Shi, Aditya Kommineni, Shrikanth Narayanan

Speech-centric machine learning systems have revolutionized many leading domains ranging from transportation and healthcare to education and defense, profoundly changing how people live, work, and interact with each other.

Fairness

MovieCLIP: Visual Scene Recognition in Movies

1 code implementation20 Oct 2022 Digbalay Bose, Rajat Hebbar, Krishna Somandepalli, Haoyang Zhang, Yin Cui, Kree Cole-McLaughlin, Huisheng Wang, Shrikanth Narayanan

Longform media such as movies have complex narrative structures, with events spanning a rich variety of ambient visual scenes.

Genre classification, Scene Recognition

Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Settings

1 code implementation26 Dec 2021 Tiantian Feng, Hanieh Hashemi, Rajat Hebbar, Murali Annavaram, Shrikanth S. Narayanan

To assess the information leakage of SER systems trained using FL, we propose an attribute inference attack framework that infers sensitive attribute information of the clients from shared gradients or model parameters, corresponding to the FedSGD and the FedAvg training algorithms, respectively.

Attribute, Federated Learning +2
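The attack described above, inferring a client's sensitive attribute from the gradients it shares during federated training, can be illustrated with a minimal, self-contained sketch. This is not the paper's code: the model, the mean-shift data simulation, and the nearest-centroid attack classifier are all simplifying assumptions made here for illustration.

```python
# Hedged sketch of a gradient-based attribute inference attack in FL
# (illustrative assumptions throughout; not the paper's implementation).
import numpy as np

rng = np.random.default_rng(0)

def client_gradient(w, X, y):
    """One FedSGD-style update: gradient of squared loss for a linear model."""
    return X.T @ (X @ w - y) / len(y)

d, n = 8, 32                          # feature dim, samples per client
w_true = rng.normal(1.0, 1.0, d)      # ground-truth model (assumed shared)
w_global = np.zeros(d)                # server model before the round

# Simulate clients whose feature distribution depends on a binary
# sensitive attribute (e.g., a demographic label -- an assumption here).
grads, attrs = [], []
for _ in range(200):
    a = int(rng.integers(0, 2))       # client's sensitive attribute
    X = rng.normal(float(a), 1.0, (n, d))   # attribute shifts feature mean
    y = X @ w_true + rng.normal(0, 0.1, n)
    grads.append(client_gradient(w_global, X, y))
    attrs.append(a)
grads, attrs = np.array(grads), np.array(attrs)

# Attack model: nearest-centroid classifier over the shared gradients.
train, test = slice(0, 150), slice(150, 200)
c0 = grads[train][attrs[train] == 0].mean(axis=0)
c1 = grads[train][attrs[train] == 1].mean(axis=0)
pred = (np.linalg.norm(grads[test] - c1, axis=1)
        < np.linalg.norm(grads[test] - c0, axis=1)).astype(int)
accuracy = (pred == attrs[test]).mean()
print(f"attribute inference accuracy: {accuracy:.2f}")
```

Because the attribute shifts the clients' feature means, the shared gradients carry a recoverable signature of it, which is the leakage the paper's framework quantifies for SER systems under FedSGD and FedAvg.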

Robust Character Labeling in Movie Videos: Data Resources and Self-supervised Feature Adaptation

no code implementations25 Aug 2020 Krishna Somandepalli, Rajat Hebbar, Shrikanth Narayanan

Our work in this paper focuses on two key aspects of this problem: the lack of domain-specific training or benchmark datasets, and adapting face embeddings learned on web images to long-form content, specifically movies.

Clustering, Domain Adaptation +2
