Search Results for author: Sumit Mukherjee

Found 10 papers, 1 paper with code

Assessment of Differentially Private Synthetic Data for Utility and Fairness in End-to-End Machine Learning Pipelines for Tabular Data

no code implementations • 30 Oct 2023 • Mayana Pereira, Meghana Kshirsagar, Sumit Mukherjee, Rahul Dodhia, Juan Lavista Ferres, Rafael de Sousa

To the best of our knowledge, our work is the first that: (i) proposes a training and evaluation framework that does not assume that real data is available for testing the utility and fairness of machine learning models trained on synthetic data; (ii) presents the most extensive analysis of synthetic data set generation algorithms in terms of utility and fairness when used for training machine learning models; and (iii) encompasses several different definitions of fairness.

Fairness Humanitarian +1

A Mean Field Approach to Empirical Bayes Estimation in High-dimensional Linear Regression

no code implementations • 28 Sep 2023 • Sumit Mukherjee, Bodhisattva Sen, Subhabrata Sen

We study empirical Bayes estimation in high-dimensional linear regression.

Bayesian Inference regression

An Analysis of the Deployment of Models Trained on Private Tabular Synthetic Data: Unexpected Surprises

no code implementations • 15 Jun 2021 • Mayana Pereira, Meghana Kshirsagar, Sumit Mukherjee, Rahul Dodhia, Juan Lavista Ferres

Differentially private (DP) synthetic datasets are a powerful approach for training machine learning models while respecting the privacy of individual data providers.

Fairness Synthetic Data Generation

A machine learning pipeline for aiding school identification from child trafficking images

no code implementations • 9 Jun 2021 • Sumit Mukherjee, Tina Sederholm, Anthony C. Roman, Ria Sankar, Sherrie Caltagirone, Juan Lavista Ferres

Child trafficking is a serious problem around the world.

BIG-bench Machine Learning

Variational Inference in high-dimensional linear regression

no code implementations • 25 Apr 2021 • Sumit Mukherjee, Subhabrata Sen

Using the nascent theory of non-linear large deviations (Chatterjee and Dembo, 2016), we derive sufficient conditions for the leading-order correctness of the naive mean-field approximation to the log-normalizing constant of the posterior distribution.

regression Variational Inference +1

Reducing bias and increasing utility by federated generative modeling of medical images using a centralized adversary

no code implementations • 18 Jan 2021 • Jean-Francois Rajotte, Sumit Mukherjee, Caleb Robinson, Anthony Ortiz, Christopher West, Juan Lavista Ferres, Raymond T Ng

We show that by using the FELICIA mechanism, a data owner with limited image samples can generate high-quality synthetic images with high utility, while neither data owner has to provide access to its data.

Federated Learning Lesion Classification +1

Detecting Structured Signals in Ising Models

no code implementations • 10 Dec 2020 • Nabarun Deb, Rajarshi Mukherjee, Sumit Mukherjee, Ming Yuan

In this paper, we study the effect of dependence on detecting a class of signals in Ising models, where the signals are present in a structured way.

Probability Statistics Theory 62G10, 62G20, 62C20

MACE: A Flexible Framework for Membership Privacy Estimation in Generative Models

no code implementations • 11 Sep 2020 • Yixi Xu, Sumit Mukherjee, Xiyang Liu, Shruti Tople, Rahul Dodhia, Juan Lavista Ferres

In this work, we propose the first formal framework for membership privacy estimation in generative models.

privGAN: Protecting GANs from membership inference attacks at low cost

1 code implementation • 31 Dec 2019 • Sumit Mukherjee, Yixi Xu, Anusua Trivedi, Juan Lavista Ferres

It has been shown that such synthetic data can be used for a variety of downstream tasks such as training classifiers that would otherwise require the original dataset to be shared.

Privacy Preserving

Risks of Using Non-verified Open Data: A case study on using Machine Learning techniques for predicting Pregnancy Outcomes in India

no code implementations • 4 Oct 2019 • Anusua Trivedi, Sumit Mukherjee, Edmund Tse, Anne Ewing, Juan Lavista Ferres

As a result, we can often end up using data that is not representative of the problem we are trying to solve.

Marketing
