Search Results for author: Gautam Kamath

Found 54 papers, 18 papers with code

Disguised Copyright Infringement of Latent Diffusion Models

no code implementations • 10 Apr 2024 • Yiwei Lu, Matthew Y. R. Yang, Zuoqiu Liu, Gautam Kamath, YaoLiang Yu

Copyright infringement may occur when a generative model produces samples substantially similar to some copyrighted data that it had access to during the training phase.

Indiscriminate Data Poisoning Attacks on Pre-trained Feature Extractors

no code implementations • 20 Feb 2024 • Yiwei Lu, Matthew Y. R. Yang, Gautam Kamath, YaoLiang Yu

In this paper, we extend the exploration of the threat of indiscriminate attacks on downstream tasks that apply pre-trained feature extractors.

Data Poisoning • Domain Adaptation +2

Not All Learnable Distribution Classes are Privately Learnable

no code implementations • 1 Feb 2024 • Mark Bun, Gautam Kamath, Argyris Mouzakis, Vikrant Singhal

We give an example of a class of distributions that is learnable in total variation distance with a finite number of samples, but not learnable under $(\varepsilon, \delta)$-differential privacy.

Exploring the Limits of Model-Targeted Indiscriminate Data Poisoning Attacks

1 code implementation • 7 Mar 2023 • Yiwei Lu, Gautam Kamath, YaoLiang Yu

Building on existing parameter corruption attacks and refining the Gradient Canceling attack, we perform extensive experiments to confirm our theoretical findings, test the predictability of our transition threshold, and significantly improve existing indiscriminate data poisoning baselines over a range of datasets and models.

Data Poisoning • Model Poisoning
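
As an illustration of the Gradient Canceling idea, here is a minimal sketch specialized to least-squares linear regression, where the poison labels can be solved for in closed form so that the total training gradient vanishes at a target parameter vector. The linear-model setting and all names are assumptions for illustration; the paper itself optimizes poison points for neural networks.

```python
import numpy as np

def cancel_gradients_linreg(X_clean, y_clean, X_poison, theta_target):
    """Gradient Canceling, specialized to least-squares regression (sketch).

    Chooses poison labels y_p so the total training gradient vanishes at
    theta_target: X_c^T (X_c t - y_c) + X_p^T (X_p t - y_p) = 0, making
    theta_target a stationary point of training on clean + poison data.
    """
    g_clean = X_clean.T @ (X_clean @ theta_target - y_clean)
    # Rearranged stationarity condition: X_p^T y_p = X_p^T X_p t + g_clean.
    rhs = X_poison.T @ (X_poison @ theta_target) + g_clean
    y_poison, *_ = np.linalg.lstsq(X_poison.T, rhs, rcond=None)
    return y_poison
```

With at least as many poison points as parameters (in general position), the system is solvable exactly, leaving zero residual gradient at the target model.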

Choosing Public Datasets for Private Machine Learning via Gradient Subspace Distance

no code implementations • 2 Mar 2023 • Xin Gu, Gautam Kamath, Zhiwei Steven Wu

We give an algorithm for selecting a public dataset by measuring a low-dimensional subspace distance between gradients of the public and private examples.
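
A minimal sketch of the subspace-distance idea, assuming per-example gradients are available as flat vectors; the function names and the projection metric below are our choices, not necessarily the paper's exact definition.

```python
import numpy as np

def top_subspace(grads, k):
    """Orthonormal basis of the top-k principal subspace of the gradients.

    grads: (n_examples, n_params) array of per-example gradients.
    """
    # Right singular vectors span the principal directions in parameter space.
    _, _, vt = np.linalg.svd(grads - grads.mean(axis=0), full_matrices=False)
    return vt[:k].T  # shape (n_params, k)

def gradient_subspace_distance(public_grads, private_grads, k=10):
    """Projection-metric distance between the two top-k gradient subspaces.

    0 means identical subspaces; sqrt(k) means orthogonal subspaces.
    """
    u = top_subspace(public_grads, k)
    v = top_subspace(private_grads, k)
    cosines = np.linalg.svd(u.T @ v, compute_uv=False)  # principal angles
    return np.sqrt(max(k - np.sum(cosines**2), 0.0))
```

The public candidate whose gradients span a subspace closest to the private one (smallest distance) is the predicted best choice for pretraining.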

Private GANs, Revisited

1 code implementation • 6 Feb 2023 • Alex Bie, Gautam Kamath, Guojun Zhang

We show that the canonical approach for training differentially private GANs -- updating the discriminator with differentially private stochastic gradient descent (DPSGD) -- can yield significantly improved results after modifications to training.

Image Generation

A Bias-Variance-Privacy Trilemma for Statistical Estimation

no code implementations • 30 Jan 2023 • Gautam Kamath, Argyris Mouzakis, Matthew Regehr, Vikrant Singhal, Thomas Steinke, Jonathan Ullman

The canonical algorithm for differentially private mean estimation is to first clip the samples to a bounded range and then add noise to their empirical mean.
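
In one dimension, this canonical estimator is a few lines; the sketch below uses the Laplace mechanism, and the clipping range is an illustrative choice.

```python
import numpy as np

def dp_mean(x, clip_lo, clip_hi, epsilon, rng=None):
    """Clip-then-noise differentially private mean estimation (sketch).

    Clipping bounds each sample's influence (introducing bias); Laplace
    noise calibrated to the clipped mean's sensitivity gives epsilon-DP
    (introducing variance).
    """
    rng = rng or np.random.default_rng()
    clipped = np.clip(x, clip_lo, clip_hi)
    # Changing one sample moves the mean by at most (clip_hi - clip_lo)/n.
    sensitivity = (clip_hi - clip_lo) / len(x)
    return clipped.mean() + rng.laplace(scale=sensitivity / epsilon)
```

Narrowing the clipping range shrinks the noise but grows the bias, and widening it does the opposite: exactly the tension the trilemma formalizes.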

Hidden Poison: Machine Unlearning Enables Camouflaged Poisoning Attacks

1 code implementation • NeurIPS 2023 • Jimmy Z. Di, Jack Douglas, Jayadev Acharya, Gautam Kamath, Ayush Sekhari

We introduce camouflaged data poisoning attacks, a new attack vector that arises in the context of machine unlearning and other settings when model retraining may be induced.

Data Poisoning • Machine Unlearning

Considerations for Differentially Private Learning with Large-Scale Public Pretraining

2 code implementations • 13 Dec 2022 • Florian Tramèr, Gautam Kamath, Nicholas Carlini

The performance of differentially private machine learning can be boosted significantly by leveraging the transfer learning capabilities of non-private models pretrained on large public datasets.

Privacy Preserving • Transfer Learning

Robustness Implies Privacy in Statistical Estimation

no code implementations • 9 Dec 2022 • Samuel B. Hopkins, Gautam Kamath, Mahbod Majid, Shyam Narayanan

We study the relationship between adversarial robustness and differential privacy in high-dimensional algorithmic statistics.

Adversarial Robustness

Private Estimation with Public Data

1 code implementation • 16 Aug 2022 • Alex Bie, Gautam Kamath, Vikrant Singhal

We initiate the study of differentially private (DP) estimation with access to a small amount of public data.

Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent

1 code implementation • 6 Jun 2022 • Da Yu, Gautam Kamath, Janardhan Kulkarni, Tie-Yan Liu, Jian Yin, Huishuai Zhang

Differentially private stochastic gradient descent (DP-SGD) is the workhorse algorithm for recent advances in private deep learning.
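
For reference, a minimal numpy sketch of the DP-SGD update that such accounting analyzes: clip each example's gradient, average, and add Gaussian noise. The flattened-gradient representation and constants are assumptions for illustration.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier,
                rng):
    """One DP-SGD update over a batch of per-example gradients (sketch).

    per_example_grads: (batch, dim) array, one flattened gradient per example.
    """
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    # Rescale each gradient so its L2 norm is at most clip_norm.
    clipped = per_example_grads * np.minimum(
        1.0, clip_norm / np.maximum(norms, 1e-12))
    noise = rng.normal(scale=noise_multiplier * clip_norm,
                       size=clipped.shape[1])
    noisy_mean = (clipped.sum(axis=0) + noise) / len(clipped)
    return params - lr * noisy_mean
```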

New Lower Bounds for Private Estimation and a Generalized Fingerprinting Lemma

no code implementations • 17 May 2022 • Gautam Kamath, Argyris Mouzakis, Vikrant Singhal

First, we provide tight lower bounds for private covariance estimation of Gaussian distributions.

Indiscriminate Data Poisoning Attacks on Neural Networks

1 code implementation • 19 Apr 2022 • Yiwei Lu, Gautam Kamath, YaoLiang Yu

Data poisoning attacks, in which a malicious adversary aims to influence a model by injecting "poisoned" data into the training process, have attracted significant recent attention.

Data Poisoning

Efficient Mean Estimation with Pure Differential Privacy via a Sum-of-Squares Exponential Mechanism

no code implementations • 25 Nov 2021 • Samuel B. Hopkins, Gautam Kamath, Mahbod Majid

"Sum-of-Squares proofs to algorithms" is a key theme in numerous recent works in high-dimensional algorithmic statistics: estimators which apparently require exponential running time, but whose analysis can be captured by low-degree Sum-of-Squares proofs, can be automatically turned into polynomial-time algorithms with the same provable guarantees.

Robust Estimation for Random Graphs

no code implementations • 9 Nov 2021 • Jayadev Acharya, Ayush Jain, Gautam Kamath, Ananda Theertha Suresh, Huanyu Zhang

We study the problem of robustly estimating the parameter $p$ of an Erdős-Rényi random graph on $n$ nodes, where a $\gamma$ fraction of nodes may be adversarially corrupted.

The Role of Adaptive Optimizers for Honest Private Hyperparameter Selection

no code implementations • NeurIPS 2021 • Shubhankar Mohapatra, Sajin Sasy, Xi He, Gautam Kamath, Om Thakkar

Hyperparameter optimization is a ubiquitous challenge in machine learning, and the performance of a trained model depends crucially on the effective selection of its hyperparameters.

Hyperparameter Optimization

A Private and Computationally-Efficient Estimator for Unbounded Gaussians

no code implementations • 8 Nov 2021 • Gautam Kamath, Argyris Mouzakis, Vikrant Singhal, Thomas Steinke, Jonathan Ullman

We give the first polynomial-time, polynomial-sample, differentially private estimator for the mean and covariance of an arbitrary Gaussian distribution $\mathcal{N}(\mu,\Sigma)$ in $\mathbb{R}^d$.

Differentially Private Fine-tuning of Language Models

2 code implementations • ICLR 2022 • Da Yu, Saurabh Naik, Arturs Backurs, Sivakanth Gopi, Huseyin A. Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, Huishuai Zhang

For example, on the MNLI dataset we achieve an accuracy of $87.8\%$ using RoBERTa-Large and $83.5\%$ using RoBERTa-Base with a privacy budget of $\epsilon = 6.7$.

Text Generation

The Price of Tolerance in Distribution Testing

no code implementations • 25 Jun 2021 • Clément L. Canonne, Ayush Jain, Gautam Kamath, Jerry Li

Specifically, we show the sample complexity to be \[\tilde \Theta\left(\frac{\sqrt{n}}{\varepsilon_2^{2}} + \frac{n}{\log n} \cdot \max\left\{\frac{\varepsilon_1}{\varepsilon_2^2}, \left(\frac{\varepsilon_1}{\varepsilon_2^2}\right)^{2}\right\}\right),\] providing a smooth tradeoff between the two previously known cases.

Improved Rates for Differentially Private Stochastic Convex Optimization with Heavy-Tailed Data

no code implementations • 2 Jun 2021 • Gautam Kamath, Xingtu Liu, Huanyu Zhang

Finally, we prove nearly-matching lower bounds for private stochastic convex optimization with strongly convex losses and mean estimation, showing new separations between pure and concentrated DP.

On the Sample Complexity of Privately Learning Unbounded High-Dimensional Gaussians

no code implementations • 19 Oct 2020 • Ishaq Aden-Ali, Hassan Ashtiani, Gautam Kamath

These are the first finite sample upper bounds for general Gaussians which do not impose restrictions on the parameters of the distribution.

Enabling Fast Differentially Private SGD via Just-in-Time Compilation and Vectorization

1 code implementation • NeurIPS 2021 • Pranav Subramani, Nicholas Vadivelu, Gautam Kamath

We also rebuild core parts of TensorFlow Privacy, integrating features from TensorFlow 2 as well as XLA compilation, granting significant memory and runtime improvements over the current release version.
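
A minimal sketch of the vectorization idea (in the spirit of the paper, though not its codebase): jax.vmap computes all per-example gradients in one batched call, and jax.jit compiles the clipping pipeline with XLA. The toy loss is an assumption.

```python
import jax
import jax.numpy as jnp

def loss(params, x, y):
    # Toy single-example loss (linear regression).
    return (jnp.dot(params, x) - y) ** 2

# vmap vectorizes the per-example gradient over the batch dimension.
per_example_grad = jax.vmap(jax.grad(loss), in_axes=(None, 0, 0))

@jax.jit  # XLA-compile the whole clipped-gradient computation
def clipped_grads(params, xs, ys, clip_norm=1.0):
    grads = per_example_grad(params, xs, ys)  # shape (batch, dim)
    norms = jnp.linalg.norm(grads, axis=1, keepdims=True)
    return grads * jnp.minimum(1.0, clip_norm / jnp.maximum(norms, 1e-12))
```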

CoinPress: Practical Private Mean and Covariance Estimation

3 code implementations • NeurIPS 2020 • Sourav Biswas, Yihe Dong, Gautam Kamath, Jonathan Ullman

We present simple differentially private estimators for the mean and covariance of multivariate sub-Gaussian data that are accurate at small sample sizes.
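
A simplified sketch of the iterative recipe: each round projects the samples into the current confidence ball, releases a noisy mean via the Gaussian mechanism, and shrinks the ball around it. The radius schedule and constants are crude stand-ins for the paper's analysis.

```python
import numpy as np

def iterative_private_mean(x, center0, radius0, rho, steps=3, rng=None):
    """CoinPress-style mean estimation under rho-zCDP (simplified sketch).

    x: (n, d) samples; (center0, radius0): a crude ball known to contain
    the mean; the budget rho is split evenly across the steps.
    """
    rng = rng or np.random.default_rng()
    n, d = x.shape
    c, r = np.asarray(center0, dtype=float), float(radius0)
    for _ in range(steps):
        # Project every sample into the current ball B(c, r).
        diff = x - c
        norms = np.maximum(np.linalg.norm(diff, axis=1, keepdims=True), 1e-12)
        projected = c + diff * np.minimum(1.0, r / norms)
        # Gaussian mechanism: the projected mean has L2 sensitivity 2r/n.
        sigma = (2 * r / n) / np.sqrt(2 * rho / steps)
        c = projected.mean(axis=0) + rng.normal(scale=sigma, size=d)
        # Shrink: sampling error ~ sqrt(d/n) plus noise magnitude sqrt(d)*sigma.
        r = np.sqrt(d / n) + np.sqrt(d) * sigma
    return c
```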

A Primer on Private Statistics

no code implementations • 30 Apr 2020 • Gautam Kamath, Jonathan Ullman

Differentially private statistical estimation has seen a flurry of developments over the last several years.

The Discrete Gaussian for Differential Privacy

2 code implementations • NeurIPS 2020 • Clément L. Canonne, Gautam Kamath, Thomas Steinke

Specifically, we theoretically and experimentally show that adding discrete Gaussian noise provides essentially the same privacy and accuracy guarantees as the addition of continuous Gaussian noise.
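
For intuition, a simple truncation-based sampler for the discrete Gaussian over the integers; the paper itself gives an exact sampler that requires no truncation.

```python
import numpy as np

def sample_discrete_gaussian(sigma, size=1, tail=20, rng=None):
    """Sample integers with probability proportional to exp(-k^2 / (2 sigma^2)).

    Truncates the support at +/- tail*sigma, beyond which the remaining
    mass is negligible for moderate `tail`.
    """
    rng = rng or np.random.default_rng()
    bound = int(np.ceil(tail * sigma))
    support = np.arange(-bound, bound + 1)
    probs = np.exp(-support.astype(float) ** 2 / (2 * sigma**2))
    probs /= probs.sum()
    return rng.choice(support, size=size, p=probs)
```

Adding such integer-valued noise to an integer-valued query sidesteps the floating-point pitfalls of continuous Gaussian noise.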

PAPRIKA: Private Online False Discovery Rate Control

1 code implementation • 27 Feb 2020 • Wanrong Zhang, Gautam Kamath, Rachel Cummings

In this work, we study False Discovery Rate (FDR) control in multiple hypothesis testing under the constraint of differential privacy for the sample.

Two-sample testing

Private Mean Estimation of Heavy-Tailed Distributions

no code implementations • 21 Feb 2020 • Gautam Kamath, Vikrant Singhal, Jonathan Ullman

We give new upper and lower bounds on the minimax sample complexity of differentially private mean estimation of distributions with bounded $k$-th moments.

Locally Private Hypothesis Selection

no code implementations • 21 Feb 2020 • Sivakanth Gopi, Gautam Kamath, Janardhan Kulkarni, Aleksandar Nikolov, Zhiwei Steven Wu, Huanyu Zhang

Absent privacy constraints, this problem requires $O(\log k)$ samples from $p$, and it was recently shown that the same complexity is achievable under (central) differential privacy.

Two-sample testing

Privately Learning Markov Random Fields

no code implementations • ICML 2020 • Huanyu Zhang, Gautam Kamath, Janardhan Kulkarni, Zhiwei Steven Wu

We consider the problem of learning Markov Random Fields (including the prototypical example, the Ising model) under the constraint of differential privacy.

Random Restrictions of High-Dimensional Distributions and Uniformity Testing with Subcube Conditioning

no code implementations • 17 Nov 2019 • Clément L. Canonne, Xi Chen, Gautam Kamath, Amit Levi, Erik Waingarten

We give a nearly-optimal algorithm for testing uniformity of distributions supported on $\{-1, 1\}^n$, which makes $\tilde O (\sqrt{n}/\varepsilon^2)$ queries to a subcube conditional sampling oracle (Bhattacharyya and Chakraborty (2018)).

Differentially Private Algorithms for Learning Mixtures of Separated Gaussians

no code implementations • NeurIPS 2019 • Gautam Kamath, Or Sheffet, Vikrant Singhal, Jonathan Ullman

Learning the parameters of Gaussian mixture models is a fundamental and widely studied problem with numerous applications.

Private Hypothesis Selection

no code implementations • NeurIPS 2019 • Mark Bun, Gautam Kamath, Thomas Steinke, Zhiwei Steven Wu

The sample complexity of our basic algorithm is $O\left(\frac{\log m}{\alpha^2} + \frac{\log m}{\alpha \varepsilon}\right)$, representing a minimal cost for privacy when compared to the non-private algorithm.

PAC learning

Private Identity Testing for High-Dimensional Distributions

no code implementations • NeurIPS 2020 • Clément L. Canonne, Gautam Kamath, Audra McMillan, Jonathan Ullman, Lydia Zakynthinou

In this work we present novel differentially private identity (goodness-of-fit) testers for natural and widely studied classes of multivariate product distributions: Gaussians in $\mathbb{R}^d$ with known covariance and product distributions over $\{\pm 1\}^{d}$.

The Structure of Optimal Private Tests for Simple Hypotheses

no code implementations • 27 Nov 2018 • Clément L. Canonne, Gautam Kamath, Audra McMillan, Adam Smith, Jonathan Ullman

Specifically, we characterize this sample complexity up to constant factors in terms of the structure of $P$ and $Q$ and the privacy level $\varepsilon$, and show that this sample complexity is achieved by a certain randomized and clamped variant of the log-likelihood ratio test.

Change Point Detection • Generalization Bounds +2
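
A sketch of the clamped, noisy log-likelihood-ratio test in question. The clamp value and acceptance threshold below are placeholders; the paper derives the optimal choices from the structure of $P$ and $Q$.

```python
import numpy as np

def private_llr_test(samples, log_p, log_q, clamp, epsilon, threshold=0.0,
                     rng=None):
    """Noisy clamped log-likelihood-ratio test of H0: P versus H1: Q.

    log_p, log_q: vectorized log-density (or log-pmf) functions. Clamping
    each per-sample LLR to [-clamp, clamp] bounds the statistic's
    sensitivity by 2*clamp, so Laplace noise yields epsilon-DP.
    """
    rng = rng or np.random.default_rng()
    llr = np.clip(log_p(samples) - log_q(samples), -clamp, clamp)
    stat = llr.sum() + rng.laplace(scale=2 * clamp / epsilon)
    return "P" if stat >= threshold else "Q"
```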

Anaconda: A Non-Adaptive Conditional Sampling Algorithm for Distribution Testing

no code implementations • 17 Jul 2018 • Gautam Kamath, Christos Tzamos

This is an exponential improvement over the previous best upper bound, and demonstrates that the complexity of the problem in this model is intermediate between the complexity of the problem in the standard sampling model and in the adaptive conditional sampling model.

Privately Learning High-Dimensional Distributions

no code implementations • 1 May 2018 • Gautam Kamath, Jerry Li, Vikrant Singhal, Jonathan Ullman

We present novel, computationally efficient, and differentially private algorithms for two fundamental high-dimensional learning problems: learning a multivariate Gaussian and learning a product distribution over the Boolean hypercube in total variation distance.

Sever: A Robust Meta-Algorithm for Stochastic Optimization

1 code implementation • 7 Mar 2018 • Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Jacob Steinhardt, Alistair Stewart

In high dimensions, most machine learning methods are brittle to even a small fraction of structured outliers.

Stochastic Optimization

INSPECTRE: Privately Estimating the Unseen

1 code implementation • ICML 2018 • Jayadev Acharya, Gautam Kamath, Ziteng Sun, Huanyu Zhang

We develop differentially private methods for estimating various distributional properties.

Actively Avoiding Nonsense in Generative Models

no code implementations • 20 Feb 2018 • Steve Hanneke, Adam Kalai, Gautam Kamath, Christos Tzamos

A generative model may generate utter nonsense when it is fit to maximize the likelihood of observed data.

Which Distribution Distances are Sublinearly Testable?

no code implementations • 31 Jul 2017 • Constantinos Daskalakis, Gautam Kamath, John Wright

Given samples from an unknown distribution $p$ and a description of a distribution $q$, are $p$ and $q$ close or far?

Robustly Learning a Gaussian: Getting Optimal Error, Efficiently

no code implementations • 12 Apr 2017 • Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Ankur Moitra, Alistair Stewart

We give robust estimators that achieve estimation error $O(\varepsilon)$ in the total variation distance, which is optimal up to a universal constant that is independent of the dimension.

Being Robust (in High Dimensions) Can Be Practical

2 code implementations • ICML 2017 • Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Ankur Moitra, Alistair Stewart

Robust estimation is much more challenging in high dimensions than it is in one dimension: Most techniques either lead to intractable optimization problems or estimators that can tolerate only a tiny fraction of errors.

Testing Ising Models

no code implementations • 9 Dec 2016 • Constantinos Daskalakis, Nishanth Dikkala, Gautam Kamath

Given samples from an unknown multivariate distribution $p$, is it possible to distinguish whether $p$ is the product of its marginals versus $p$ being far from every product distribution?

Robust Estimators in High Dimensions without the Computational Intractability

2 code implementations • 21 Apr 2016 • Ilias Diakonikolas, Gautam Kamath, Daniel Kane, Jerry Li, Ankur Moitra, Alistair Stewart

We study high-dimensional distribution learning in an agnostic setting where an adversary is allowed to arbitrarily corrupt an $\varepsilon$-fraction of the samples.

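A simplified version of the spectral filtering idea behind these estimators: if the empirical covariance has an abnormally large eigenvalue, the corruptions must be responsible, so the points projecting furthest along that direction are discarded. The threshold and removal rule are illustrative, not the paper's.

```python
import numpy as np

def filtered_mean(x, eps, spectral_threshold=10.0):
    """Robust mean estimation by iterative spectral filtering (sketch).

    x: (n, d) samples, an eps-fraction of which may be corrupted.
    """
    x = np.array(x, dtype=float)
    while True:
        mu = x.mean(axis=0)
        eigvals, eigvecs = np.linalg.eigh(np.cov(x, rowvar=False))
        if eigvals[-1] <= spectral_threshold:  # spectrum looks inlier-like
            return mu
        # Score points by their projection onto the most suspicious
        # direction and drop the eps-fraction that sticks out the most.
        scores = np.abs((x - mu) @ eigvecs[:, -1])
        keep = scores <= np.quantile(scores, 1 - eps)
        if keep.all():
            return mu
        x = x[keep]
```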

A Size-Free CLT for Poisson Multinomials and its Applications

no code implementations • 11 Nov 2015 • Constantinos Daskalakis, Anindya De, Gautam Kamath, Christos Tzamos

Finally, leveraging the structural properties of the Fourier spectrum of PMDs we show that these distributions can be learned from $O_k(1/\varepsilon^2)$ samples in ${\rm poly}_k(1/\varepsilon)$-time, removing the quasi-polynomial dependence of the running time on $1/\varepsilon$ from the algorithm of Daskalakis, Kamath, and Tzamos.

Optimal Testing for Properties of Distributions

no code implementations • NeurIPS 2015 • Jayadev Acharya, Constantinos Daskalakis, Gautam Kamath

Given samples from an unknown distribution $p$, is it possible to distinguish whether $p$ belongs to some class of distributions $\mathcal{C}$ versus $p$ being far from every distribution in $\mathcal{C}$?

On the Structure, Covering, and Learning of Poisson Multinomial Distributions

no code implementations • 30 Apr 2015 • Constantinos Daskalakis, Gautam Kamath, Christos Tzamos

We prove a structural characterization of these distributions, showing that, for all $\varepsilon >0$, any $(n, k)$-Poisson multinomial random vector is $\varepsilon$-close, in total variation distance, to the sum of a discretized multidimensional Gaussian and an independent $(\text{poly}(k/\varepsilon), k)$-Poisson multinomial random vector.

A Chasm Between Identity and Equivalence Testing with Conditional Queries

no code implementations • 26 Nov 2014 • Jayadev Acharya, Clément L. Canonne, Gautam Kamath

We answer a question of Chakraborty et al. (ITCS 2013) showing that non-adaptive uniformity testing indeed requires $\Omega(\log n)$ queries in the conditional model.

Faster and Sample Near-Optimal Algorithms for Proper Learning Mixtures of Gaussians

no code implementations • 4 Dec 2013 • Constantinos Daskalakis, Gautam Kamath

The algorithm requires ${O}(\log{N}/\varepsilon^2)$ samples from the unknown distribution and ${O}(N \log N/\varepsilon^2)$ time, which improves upon previous such results (such as the Scheffé estimator) from a quadratic dependence of the running time on $N$ to quasilinear.
