Search Results for author: Gautam Kamath

Found 54 papers, 18 papers with code

Disguised Copyright Infringement of Latent Diffusion Models

no code implementations • 10 Apr 2024 • Yiwei Lu, Matthew Y. R. Yang, Zuoqiu Liu, Gautam Kamath, YaoLiang Yu

Copyright infringement may occur when a generative model produces samples substantially similar to some copyrighted data that it had access to during the training phase.

Indiscriminate Data Poisoning Attacks on Pre-trained Feature Extractors

no code implementations • 20 Feb 2024 • Yiwei Lu, Matthew Y. R. Yang, Gautam Kamath, YaoLiang Yu

In this paper, we extend the exploration of the threat of indiscriminate attacks on downstream tasks that apply pre-trained feature extractors.

Data Poisoning • Domain Adaptation +2

Not All Learnable Distribution Classes are Privately Learnable

no code implementations • 1 Feb 2024 • Mark Bun, Gautam Kamath, Argyris Mouzakis, Vikrant Singhal

We give an example of a class of distributions that is learnable in total variation distance with a finite number of samples, but not learnable under $(\varepsilon, \delta)$-differential privacy.

Exploring the Limits of Model-Targeted Indiscriminate Data Poisoning Attacks

1 code implementation • 7 Mar 2023 • Yiwei Lu, Gautam Kamath, YaoLiang Yu

Building on existing parameter corruption attacks and refining the Gradient Canceling attack, we perform extensive experiments to confirm our theoretical findings, test the predictability of our transition threshold, and significantly improve existing indiscriminate data poisoning baselines over a range of datasets and models.

Data Poisoning • Model Poisoning
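
As an illustration of the Gradient Canceling idea, here is a minimal sketch specialized to least-squares linear regression, where the poison labels can be solved for in closed form so that the total training gradient vanishes at a target parameter vector. The linear-model setting and all names are assumptions for illustration; the paper itself optimizes poison points for neural networks.

```python
import numpy as np

def cancel_gradients_linreg(X_clean, y_clean, X_poison, theta_target):
    """Gradient Canceling, specialized to least-squares regression (sketch).

    Chooses poison labels y_p so the total training gradient vanishes at
    theta_target: X_c^T (X_c t - y_c) + X_p^T (X_p t - y_p) = 0, making
    theta_target a stationary point of training on clean + poison data.
    """
    g_clean = X_clean.T @ (X_clean @ theta_target - y_clean)
    # Rearranged stationarity condition: X_p^T y_p = X_p^T X_p t + g_clean.
    rhs = X_poison.T @ (X_poison @ theta_target) + g_clean
    y_poison, *_ = np.linalg.lstsq(X_poison.T, rhs, rcond=None)
    return y_poison
```

With at least as many poison points as parameters (in general position), the system is solvable exactly, leaving zero residual gradient at the target model.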

Choosing Public Datasets for Private Machine Learning via Gradient Subspace Distance

no code implementations • 2 Mar 2023 • Xin Gu, Gautam Kamath, Zhiwei Steven Wu

We give an algorithm for selecting a public dataset by measuring a low-dimensional subspace distance between gradients of the public and private examples.
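
A minimal sketch of the subspace-distance idea, assuming per-example gradients are available as flat vectors; the function names and the projection metric below are our choices, not necessarily the paper's exact definition.

```python
import numpy as np

def top_subspace(grads, k):
    """Orthonormal basis of the top-k principal subspace of the gradients.

    grads: (n_examples, n_params) array of per-example gradients.
    """
    # Right singular vectors span the principal directions in parameter space.
    _, _, vt = np.linalg.svd(grads - grads.mean(axis=0), full_matrices=False)
    return vt[:k].T  # shape (n_params, k)

def gradient_subspace_distance(public_grads, private_grads, k=10):
    """Projection-metric distance between the two top-k gradient subspaces.

    0 means identical subspaces; sqrt(k) means orthogonal subspaces.
    """
    u = top_subspace(public_grads, k)
    v = top_subspace(private_grads, k)
    cosines = np.linalg.svd(u.T @ v, compute_uv=False)  # principal angles
    return np.sqrt(max(k - np.sum(cosines**2), 0.0))
```

The public candidate whose gradients span a subspace closest to the private one (smallest distance) is the predicted best choice for pretraining.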

Private GANs, Revisited

1 code implementation • 6 Feb 2023 • Alex Bie, Gautam Kamath, Guojun Zhang

We show that the canonical approach for training differentially private GANs -- updating the discriminator with differentially private stochastic gradient descent (DPSGD) -- can yield significantly improved results after modifications to training.

Image Generation

A Bias-Variance-Privacy Trilemma for Statistical Estimation

no code implementations • 30 Jan 2023 • Gautam Kamath, Argyris Mouzakis, Matthew Regehr, Vikrant Singhal, Thomas Steinke, Jonathan Ullman

The canonical algorithm for differentially private mean estimation is to first clip the samples to a bounded range and then add noise to their empirical mean.
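
In one dimension, this canonical estimator is a few lines; the sketch below uses the Laplace mechanism, and the clipping range is an illustrative choice.

```python
import numpy as np

def dp_mean(x, clip_lo, clip_hi, epsilon, rng=None):
    """Clip-then-noise differentially private mean estimation (sketch).

    Clipping bounds each sample's influence (introducing bias); Laplace
    noise calibrated to the clipped mean's sensitivity gives epsilon-DP
    (introducing variance).
    """
    rng = rng or np.random.default_rng()
    clipped = np.clip(x, clip_lo, clip_hi)
    # Changing one sample moves the mean by at most (clip_hi - clip_lo)/n.
    sensitivity = (clip_hi - clip_lo) / len(x)
    return clipped.mean() + rng.laplace(scale=sensitivity / epsilon)
```

Narrowing the clipping range shrinks the noise but grows the bias, and widening it does the opposite: exactly the tension the trilemma formalizes.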

Hidden Poison: Machine Unlearning Enables Camouflaged Poisoning Attacks

1 code implementation • NeurIPS 2023 • Jimmy Z. Di, Jack Douglas, Jayadev Acharya, Gautam Kamath, Ayush Sekhari

We introduce camouflaged data poisoning attacks, a new attack vector that arises in the context of machine unlearning and other settings when model retraining may be induced.

Data Poisoning • Machine Unlearning

Considerations for Differentially Private Learning with Large-Scale Public Pretraining

2 code implementations • 13 Dec 2022 • Florian Tramèr, Gautam Kamath, Nicholas Carlini

The performance of differentially private machine learning can be boosted significantly by leveraging the transfer learning capabilities of non-private models pretrained on large public datasets.

Privacy Preserving • Transfer Learning

Robustness Implies Privacy in Statistical Estimation

no code implementations • 9 Dec 2022 • Samuel B. Hopkins, Gautam Kamath, Mahbod Majid, Shyam Narayanan

We study the relationship between adversarial robustness and differential privacy in high-dimensional algorithmic statistics.

Adversarial Robustness

Private Estimation with Public Data

1 code implementation • 16 Aug 2022 • Alex Bie, Gautam Kamath, Vikrant Singhal

We initiate the study of differentially private (DP) estimation with access to a small amount of public data.

Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent

1 code implementation • 6 Jun 2022 • Da Yu, Gautam Kamath, Janardhan Kulkarni, Tie-Yan Liu, Jian Yin, Huishuai Zhang

Differentially private stochastic gradient descent (DP-SGD) is the workhorse algorithm for recent advances in private deep learning.
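
For reference, a minimal numpy sketch of the DP-SGD update that such accounting analyzes: clip each example's gradient, average, and add Gaussian noise. The flattened-gradient representation and constants are assumptions for illustration.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier,
                rng):
    """One DP-SGD update over a batch of per-example gradients (sketch).

    per_example_grads: (batch, dim) array, one flattened gradient per example.
    """
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    # Rescale each gradient so its L2 norm is at most clip_norm.
    clipped = per_example_grads * np.minimum(
        1.0, clip_norm / np.maximum(norms, 1e-12))
    noise = rng.normal(scale=noise_multiplier * clip_norm,
                       size=clipped.shape[1])
    noisy_mean = (clipped.sum(axis=0) + noise) / len(clipped)
    return params - lr * noisy_mean
```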

New Lower Bounds for Private Estimation and a Generalized Fingerprinting Lemma

no code implementations • 17 May 2022 • Gautam Kamath, Argyris Mouzakis, Vikrant Singhal

First, we provide tight lower bounds for private covariance estimation of Gaussian distributions.

Indiscriminate Data Poisoning Attacks on Neural Networks

1 code implementation • 19 Apr 2022 • Yiwei Lu, Gautam Kamath, YaoLiang Yu

Data poisoning attacks, in which a malicious adversary aims to influence a model by injecting "poisoned" data into the training process, have attracted significant recent attention.

Data Poisoning

Efficient Mean Estimation with Pure Differential Privacy via a Sum-of-Squares Exponential Mechanism

no code implementations • 25 Nov 2021 • Samuel B. Hopkins, Gautam Kamath, Mahbod Majid

"Sum-of-Squares proofs to algorithms" is a key theme in numerous recent works in high-dimensional algorithmic statistics: estimators which apparently require exponential running time, but whose analysis can be captured by low-degree Sum-of-Squares proofs, can be automatically turned into polynomial-time algorithms with the same provable guarantees.

Robust Estimation for Random Graphs

no code implementations • 9 Nov 2021 • Jayadev Acharya, Ayush Jain, Gautam Kamath, Ananda Theertha Suresh, Huanyu Zhang

We study the problem of robustly estimating the parameter $p$ of an Erdős-Rényi random graph on $n$ nodes, where a $\gamma$ fraction of nodes may be adversarially corrupted.

The Role of Adaptive Optimizers for Honest Private Hyperparameter Selection

no code implementations • NeurIPS 2021 • Shubhankar Mohapatra, Sajin Sasy, Xi He, Gautam Kamath, Om Thakkar

Hyperparameter optimization is a ubiquitous challenge in machine learning, and the performance of a trained model depends crucially on the effective selection of its hyperparameters.

Hyperparameter Optimization

A Private and Computationally-Efficient Estimator for Unbounded Gaussians

no code implementations • 8 Nov 2021 • Gautam Kamath, Argyris Mouzakis, Vikrant Singhal, Thomas Steinke, Jonathan Ullman

We give the first polynomial-time, polynomial-sample, differentially private estimator for the mean and covariance of an arbitrary Gaussian distribution $\mathcal{N}(\mu,\Sigma)$ in $\mathbb{R}^d$.

Differentially Private Fine-tuning of Language Models

2 code implementations • ICLR 2022 • Da Yu, Saurabh Naik, Arturs Backurs, Sivakanth Gopi, Huseyin A. Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, Huishuai Zhang

For example, on the MNLI dataset we achieve an accuracy of $87.8\%$ using RoBERTa-Large and $83.5\%$ using RoBERTa-Base with a privacy budget of $\epsilon = 6.7$.

Text Generation

The Price of Tolerance in Distribution Testing

no code implementations • 25 Jun 2021 • Clément L. Canonne, Ayush Jain, Gautam Kamath, Jerry Li

Specifically, we show the sample complexity to be \[\tilde \Theta\left(\frac{\sqrt{n}}{\varepsilon_2^{2}} + \frac{n}{\log n} \cdot \max\left\{\frac{\varepsilon_1}{\varepsilon_2^2}, \left(\frac{\varepsilon_1}{\varepsilon_2^2}\right)^{2}\right\}\right),\] providing a smooth tradeoff between the two previously known cases.

Improved Rates for Differentially Private Stochastic Convex Optimization with Heavy-Tailed Data

no code implementations • 2 Jun 2021 • Gautam Kamath, Xingtu Liu, Huanyu Zhang

Finally, we prove nearly-matching lower bounds for private stochastic convex optimization with strongly convex losses and mean estimation, showing new separations between pure and concentrated DP.

On the Sample Complexity of Privately Learning Unbounded High-Dimensional Gaussians

no code implementations • 19 Oct 2020 • Ishaq Aden-Ali, Hassan Ashtiani, Gautam Kamath

These are the first finite sample upper bounds for general Gaussians which do not impose restrictions on the parameters of the distribution.

Enabling Fast Differentially Private SGD via Just-in-Time Compilation and Vectorization

1 code implementation • NeurIPS 2021 • Pranav Subramani, Nicholas Vadivelu, Gautam Kamath

We also rebuild core parts of TensorFlow Privacy, integrating features from TensorFlow 2 as well as XLA compilation, granting significant memory and runtime improvements over the current release version.
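
A minimal sketch of the vectorization idea (in the spirit of the paper, though not its codebase): jax.vmap computes all per-example gradients in one batched call, and jax.jit compiles the clipping pipeline with XLA. The toy loss is an assumption.

```python
import jax
import jax.numpy as jnp

def loss(params, x, y):
    # Toy single-example loss (linear regression).
    return (jnp.dot(params, x) - y) ** 2

# vmap vectorizes the per-example gradient over the batch dimension.
per_example_grad = jax.vmap(jax.grad(loss), in_axes=(None, 0, 0))

@jax.jit  # XLA-compile the whole clipped-gradient computation
def clipped_grads(params, xs, ys, clip_norm=1.0):
    grads = per_example_grad(params, xs, ys)  # shape (batch, dim)
    norms = jnp.linalg.norm(grads, axis=1, keepdims=True)
    return grads * jnp.minimum(1.0, clip_norm / jnp.maximum(norms, 1e-12))
```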

CoinPress: Practical Private Mean and Covariance Estimation

3 code implementations • NeurIPS 2020 • Sourav Biswas, Yihe Dong, Gautam Kamath, Jonathan Ullman

We present simple differentially private estimators for the mean and covariance of multivariate sub-Gaussian data that are accurate at small sample sizes.
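
A simplified sketch of the iterative recipe: each round projects the samples into the current confidence ball, releases a noisy mean via the Gaussian mechanism, and shrinks the ball around it. The radius schedule and constants are crude stand-ins for the paper's analysis.

```python
import numpy as np

def iterative_private_mean(x, center0, radius0, rho, steps=3, rng=None):
    """CoinPress-style mean estimation under rho-zCDP (simplified sketch).

    x: (n, d) samples; (center0, radius0): a crude ball known to contain
    the mean; the budget rho is split evenly across the steps.
    """
    rng = rng or np.random.default_rng()
    n, d = x.shape
    c, r = np.asarray(center0, dtype=float), float(radius0)
    for _ in range(steps):
        # Project every sample into the current ball B(c, r).
        diff = x - c
        norms = np.maximum(np.linalg.norm(diff, axis=1, keepdims=True), 1e-12)
        projected = c + diff * np.minimum(1.0, r / norms)
        # Gaussian mechanism: the projected mean has L2 sensitivity 2r/n.
        sigma = (2 * r / n) / np.sqrt(2 * rho / steps)
        c = projected.mean(axis=0) + rng.normal(scale=sigma, size=d)
        # Shrink: sampling error ~ sqrt(d/n) plus noise magnitude sqrt(d)*sigma.
        r = np.sqrt(d / n) + np.sqrt(d) * sigma
    return c
```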

A Primer on Private Statistics

no code implementations • 30 Apr 2020 • Gautam Kamath, Jonathan Ullman

Differentially private statistical estimation has seen a flurry of developments over the last several years.

The Discrete Gaussian for Differential Privacy

2 code implementations • NeurIPS 2020 • Clément L. Canonne, Gautam Kamath, Thomas Steinke

Specifically, we theoretically and experimentally show that adding discrete Gaussian noise provides essentially the same privacy and accuracy guarantees as the addition of continuous Gaussian noise.
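
For intuition, a simple truncation-based sampler for the discrete Gaussian over the integers; the paper itself gives an exact sampler that requires no truncation.

```python
import numpy as np

def sample_discrete_gaussian(sigma, size=1, tail=20, rng=None):
    """Sample integers with probability proportional to exp(-k^2 / (2 sigma^2)).

    Truncates the support at +/- tail*sigma, beyond which the remaining
    mass is negligible for moderate `tail`.
    """
    rng = rng or np.random.default_rng()
    bound = int(np.ceil(tail * sigma))
    support = np.arange(-bound, bound + 1)
    probs = np.exp(-support.astype(float) ** 2 / (2 * sigma**2))
    probs /= probs.sum()
    return rng.choice(support, size=size, p=probs)
```

Adding such integer-valued noise to an integer-valued query sidesteps the floating-point pitfalls of continuous Gaussian noise.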

PAPRIKA: Private Online False Discovery Rate Control

1 code implementation • 27 Feb 2020 • Wanrong Zhang, Gautam Kamath, Rachel Cummings

In this work, we study False Discovery Rate (FDR) control in multiple hypothesis testing under the constraint of differential privacy for the sample.

Two-sample testing

Private Mean Estimation of Heavy-Tailed Distributions

no code implementations • 21 Feb 2020 • Gautam Kamath, Vikrant Singhal, Jonathan Ullman

We give new upper and lower bounds on the minimax sample complexity of differentially private mean estimation of distributions with bounded $k$-th moments.

Locally Private Hypothesis Selection

no code implementations • 21 Feb 2020 • Sivakanth Gopi, Gautam Kamath, Janardhan Kulkarni, Aleksandar Nikolov, Zhiwei Steven Wu, Huanyu Zhang

Absent privacy constraints, this problem requires $O(\log k)$ samples from $p$, and it was recently shown that the same complexity is achievable under (central) differential privacy.

Two-sample testing

Privately Learning Markov Random Fields

no code implementations • ICML 2020 • Huanyu Zhang, Gautam Kamath, Janardhan Kulkarni, Zhiwei Steven Wu

We consider the problem of learning Markov Random Fields (including the prototypical example, the Ising model) under the constraint of differential privacy.

Random Restrictions of High-Dimensional Distributions and Uniformity Testing with Subcube Conditioning

no code implementations • 17 Nov 2019 • Clément L. Canonne, Xi Chen, Gautam Kamath, Amit Levi, Erik Waingarten

We give a nearly-optimal algorithm for testing uniformity of distributions supported on $\{-1, 1\}^n$, which makes $\tilde O (\sqrt{n}/\varepsilon^2)$ queries to a subcube conditional sampling oracle (Bhattacharyya and Chakraborty (2018)).

Differentially Private Algorithms for Learning Mixtures of Separated Gaussians

no code implementations • NeurIPS 2019 • Gautam Kamath, Or Sheffet, Vikrant Singhal, Jonathan Ullman

Learning the parameters of Gaussian mixture models is a fundamental and widely studied problem with numerous applications.

Private Hypothesis Selection

no code implementations • NeurIPS 2019 • Mark Bun, Gautam Kamath, Thomas Steinke, Zhiwei Steven Wu

The sample complexity of our basic algorithm is $O\left(\frac{\log m}{\alpha^2} + \frac{\log m}{\alpha \varepsilon}\right)$, representing a minimal cost for privacy when compared to the non-private algorithm.

PAC learning

Private Identity Testing for High-Dimensional Distributions

no code implementations • NeurIPS 2020 • Clément L. Canonne, Gautam Kamath, Audra McMillan, Jonathan Ullman, Lydia Zakynthinou

In this work we present novel differentially private identity (goodness-of-fit) testers for natural and widely studied classes of multivariate product distributions: Gaussians in $\mathbb{R}^d$ with known covariance and product distributions over $\{\pm 1\}^{d}$.

The Structure of Optimal Private Tests for Simple Hypotheses

no code implementations • 27 Nov 2018 • Clément L. Canonne, Gautam Kamath, Audra McMillan, Adam Smith, Jonathan Ullman

Specifically, we characterize this sample complexity up to constant factors in terms of the structure of $P$ and $Q$ and the privacy level $\varepsilon$, and show that this sample complexity is achieved by a certain randomized and clamped variant of the log-likelihood ratio test.

Change Point Detection • Generalization Bounds +2
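
A sketch of the clamped, noisy log-likelihood-ratio test in question. The clamp value and acceptance threshold below are placeholders; the paper derives the optimal choices from the structure of $P$ and $Q$.

```python
import numpy as np

def private_llr_test(samples, log_p, log_q, clamp, epsilon, threshold=0.0,
                     rng=None):
    """Noisy clamped log-likelihood-ratio test of H0: P versus H1: Q.

    log_p, log_q: vectorized log-density (or log-pmf) functions. Clamping
    each per-sample LLR to [-clamp, clamp] bounds the statistic's
    sensitivity by 2*clamp, so Laplace noise yields epsilon-DP.
    """
    rng = rng or np.random.default_rng()
    llr = np.clip(log_p(samples) - log_q(samples), -clamp, clamp)
    stat = llr.sum() + rng.laplace(scale=2 * clamp / epsilon)
    return "P" if stat >= threshold else "Q"
```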

Anaconda: A Non-Adaptive Conditional Sampling Algorithm for Distribution Testing

no code implementations • 17 Jul 2018 • Gautam Kamath, Christos Tzamos

This is an exponential improvement over the previous best upper bound, and demonstrates that the complexity of the problem in this model is intermediate between the complexity of the problem in the standard sampling model and in the adaptive conditional sampling model.

Privately Learning High-Dimensional Distributions

no code implementations • 1 May 2018 • Gautam Kamath, Jerry Li, Vikrant Singhal, Jonathan Ullman

We present novel, computationally efficient, and differentially private algorithms for two fundamental high-dimensional learning problems: learning a multivariate Gaussian and learning a product distribution over the Boolean hypercube in total variation distance.

Sever: A Robust Meta-Algorithm for Stochastic Optimization

1 code implementation • 7 Mar 2018 • Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Jacob Steinhardt, Alistair Stewart

In high dimensions, most machine learning methods are brittle to even a small fraction of structured outliers.

Stochastic Optimization

INSPECTRE: Privately Estimating the Unseen

1 code implementation • ICML 2018 • Jayadev Acharya, Gautam Kamath, Ziteng Sun, Huanyu Zhang

We develop differentially private methods for estimating various distributional properties.

Actively Avoiding Nonsense in Generative Models

no code implementations • 20 Feb 2018 • Steve Hanneke, Adam Kalai, Gautam Kamath, Christos Tzamos

A generative model may generate utter nonsense when it is fit to maximize the likelihood of observed data.

Which Distribution Distances are Sublinearly Testable?

no code implementations • 31 Jul 2017 • Constantinos Daskalakis, Gautam Kamath, John Wright

Given samples from an unknown distribution $p$ and a description of a distribution $q$, are $p$ and $q$ close or far?

Robustly Learning a Gaussian: Getting Optimal Error, Efficiently

no code implementations • 12 Apr 2017 • Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Ankur Moitra, Alistair Stewart

We give robust estimators that achieve estimation error $O(\varepsilon)$ in the total variation distance, which is optimal up to a universal constant that is independent of the dimension.

Being Robust (in High Dimensions) Can Be Practical

2 code implementations • ICML 2017 • Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Ankur Moitra, Alistair Stewart

Robust estimation is much more challenging in high dimensions than it is in one dimension: Most techniques either lead to intractable optimization problems or estimators that can tolerate only a tiny fraction of errors.

Testing Ising Models

no code implementations • 9 Dec 2016 • Constantinos Daskalakis, Nishanth Dikkala, Gautam Kamath

Given samples from an unknown multivariate distribution $p$, is it possible to distinguish whether $p$ is the product of its marginals versus $p$ being far from every product distribution?

Robust Estimators in High Dimensions without the Computational Intractability

2 code implementations • 21 Apr 2016 • Ilias Diakonikolas, Gautam Kamath, Daniel Kane, Jerry Li, Ankur Moitra, Alistair Stewart

We study high-dimensional distribution learning in an agnostic setting where an adversary is allowed to arbitrarily corrupt an $\varepsilon$-fraction of the samples.

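A simplified version of the spectral filtering idea behind these estimators: if the empirical covariance has an abnormally large eigenvalue, the corruptions must be responsible, so the points projecting furthest along that direction are discarded. The threshold and removal rule are illustrative, not the paper's.

```python
import numpy as np

def filtered_mean(x, eps, spectral_threshold=10.0):
    """Robust mean estimation by iterative spectral filtering (sketch).

    x: (n, d) samples, an eps-fraction of which may be corrupted.
    """
    x = np.array(x, dtype=float)
    while True:
        mu = x.mean(axis=0)
        eigvals, eigvecs = np.linalg.eigh(np.cov(x, rowvar=False))
        if eigvals[-1] <= spectral_threshold:  # spectrum looks inlier-like
            return mu
        # Score points by their projection onto the most suspicious
        # direction and drop the eps-fraction that sticks out the most.
        scores = np.abs((x - mu) @ eigvecs[:, -1])
        keep = scores <= np.quantile(scores, 1 - eps)
        if keep.all():
            return mu
        x = x[keep]
```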

A Size-Free CLT for Poisson Multinomials and its Applications

no code implementations • 11 Nov 2015 • Constantinos Daskalakis, Anindya De, Gautam Kamath, Christos Tzamos

Finally, leveraging the structural properties of the Fourier spectrum of PMDs we show that these distributions can be learned from $O_k(1/\varepsilon^2)$ samples in ${\rm poly}_k(1/\varepsilon)$-time, removing the quasi-polynomial dependence of the running time on $1/\varepsilon$ from the algorithm of Daskalakis, Kamath, and Tzamos.

Optimal Testing for Properties of Distributions

no code implementations • NeurIPS 2015 • Jayadev Acharya, Constantinos Daskalakis, Gautam Kamath

Given samples from an unknown distribution $p$, is it possible to distinguish whether $p$ belongs to some class of distributions $\mathcal{C}$ versus $p$ being far from every distribution in $\mathcal{C}$?

On the Structure, Covering, and Learning of Poisson Multinomial Distributions

no code implementations • 30 Apr 2015 • Constantinos Daskalakis, Gautam Kamath, Christos Tzamos

We prove a structural characterization of these distributions, showing that, for all $\varepsilon >0$, any $(n, k)$-Poisson multinomial random vector is $\varepsilon$-close, in total variation distance, to the sum of a discretized multidimensional Gaussian and an independent $(\text{poly}(k/\varepsilon), k)$-Poisson multinomial random vector.

A Chasm Between Identity and Equivalence Testing with Conditional Queries

no code implementations • 26 Nov 2014 • Jayadev Acharya, Clément L. Canonne, Gautam Kamath

We answer a question of Chakraborty et al. (ITCS 2013) showing that non-adaptive uniformity testing indeed requires $\Omega(\log n)$ queries in the conditional model.

Faster and Sample Near-Optimal Algorithms for Proper Learning Mixtures of Gaussians

no code implementations • 4 Dec 2013 • Constantinos Daskalakis, Gautam Kamath

The algorithm requires ${O}(\log{N}/\varepsilon^2)$ samples from the unknown distribution and ${O}(N \log N/\varepsilon^2)$ time, which improves upon previous such results (such as the Scheffé estimator) from a quadratic dependence of the running time on $N$ to quasilinear.
