Search Results for author: Sumit Mukherjee

Found 10 papers, 1 paper with code

Assessment of Differentially Private Synthetic Data for Utility and Fairness in End-to-End Machine Learning Pipelines for Tabular Data

no code implementations • 30 Oct 2023 • Mayana Pereira, Meghana Kshirsagar, Sumit Mukherjee, Rahul Dodhia, Juan Lavista Ferres, Rafael de Sousa

To the best of our knowledge, our work is the first that: (i) proposes a training and evaluation framework that does not assume that real data is available for testing the utility and fairness of machine learning models trained on synthetic data; (ii) presents the most extensive analysis of synthetic data set generation algorithms in terms of utility and fairness when used for training machine learning models; and (iii) encompasses several different definitions of fairness.

Fairness Humanitarian +1

An Analysis of the Deployment of Models Trained on Private Tabular Synthetic Data: Unexpected Surprises

no code implementations • 15 Jun 2021 • Mayana Pereira, Meghana Kshirsagar, Sumit Mukherjee, Rahul Dodhia, Juan Lavista Ferres

Differentially private (DP) synthetic datasets are a powerful approach for training machine learning models while respecting the privacy of individual data providers.

Fairness Synthetic Data Generation

Variational Inference in high-dimensional linear regression

no code implementations • 25 Apr 2021 • Sumit Mukherjee, Subhabrata Sen

Using the nascent theory of non-linear large deviations (Chatterjee and Dembo, 2016), we derive sufficient conditions for the leading-order correctness of the naive mean-field approximation to the log-normalizing constant of the posterior distribution.

regression Variational Inference +1

Reducing bias and increasing utility by federated generative modeling of medical images using a centralized adversary

no code implementations • 18 Jan 2021 • Jean-Francois Rajotte, Sumit Mukherjee, Caleb Robinson, Anthony Ortiz, Christopher West, Juan Lavista Ferres, Raymond T Ng

We show that by using the FELICIA mechanism, a data owner with limited image samples can generate high-quality synthetic images with high utility, while neither data owner has to provide access to its data.

Federated Learning Lesion Classification +1

Detecting Structured Signals in Ising Models

no code implementations • 10 Dec 2020 • Nabarun Deb, Rajarshi Mukherjee, Sumit Mukherjee, Ming Yuan

In this paper, we study the effect of dependence on detecting a class of signals in Ising models, where the signals are present in a structured way.

Probability Statistics Theory 62G10, 62G20, 62C20

MACE: A Flexible Framework for Membership Privacy Estimation in Generative Models

no code implementations • 11 Sep 2020 • Yixi Xu, Sumit Mukherjee, Xiyang Liu, Shruti Tople, Rahul Dodhia, Juan Lavista Ferres

In this work, we propose the first formal framework for membership privacy estimation in generative models.

privGAN: Protecting GANs from membership inference attacks at low cost

1 code implementation • 31 Dec 2019 • Sumit Mukherjee, Yixi Xu, Anusua Trivedi, Juan Lavista Ferres

It has been shown that such synthetic data can be used for a variety of downstream tasks such as training classifiers that would otherwise require the original dataset to be shared.

Privacy Preserving
