no code implementations • ICML 2020 • Jayadev Acharya, Kallista Bonawitz, Peter Kairouz, Daniel Ramage, Ziteng Sun
The original definition of LDP assumes that all the elements in the data domain are equally sensitive.
1 code implementation • NeurIPS 2023 • Jimmy Z. Di, Jack Douglas, Jayadev Acharya, Gautam Kamath, Ayush Sekhari
We introduce camouflaged data poisoning attacks, a new attack vector that arises in the context of machine unlearning and other settings when model retraining may be induced.
no code implementations • 7 Nov 2022 • Jayadev Acharya, YuHan Liu, Ziteng Sun
Perhaps surprisingly, we show that in suitable parameter regimes, having $m$ samples per user is equivalent to having $m$ times more users, each with only one sample.
no code implementations • 14 Mar 2022 • Jayadev Acharya, Clément L. Canonne, Ziteng Sun, Himanshu Tyagi
Without sparsity assumptions, it has been established that interactivity cannot improve the minimax rates of estimation under these information constraints.
no code implementations • NeurIPS 2021 • Jayadev Acharya, Clement Canonne, YuHan Liu, Ziteng Sun, Himanshu Tyagi
We obtain tight minimax rates for the problem of distributed estimation of discrete distributions under communication constraints, where $n$ users observing $m$ samples each can broadcast only $\ell$ bits.
no code implementations • 9 Nov 2021 • Jayadev Acharya, Ayush Jain, Gautam Kamath, Ananda Theertha Suresh, Huanyu Zhang
We study the problem of robustly estimating the parameter $p$ of an Erdős-Rényi random graph on $n$ nodes, where a $\gamma$ fraction of nodes may be adversarially corrupted.
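As a toy illustration of the robustness issue (a hedged sketch, not the paper's minimax-optimal estimator): the median node degree tolerates corruption of fewer than half the nodes and has bias at most on the order of $\gamma$, whereas the mean degree can be pushed much further by corrupted high-degree nodes.

```python
import numpy as np

def median_degree_estimate(adj):
    """Estimate the edge probability p of an Erdos-Renyi graph from the
    median node degree. This is a robust baseline with O(gamma) bias,
    not the paper's optimal estimator."""
    n = adj.shape[0]
    degrees = adj.sum(axis=1)
    return np.median(degrees) / (n - 1)

rng = np.random.default_rng(0)
n, p, gamma = 400, 0.1, 0.1

# Sample a symmetric Erdos-Renyi adjacency matrix with edge probability p.
upper = rng.random((n, n)) < p
adj = np.triu(upper, 1)
adj = (adj | adj.T).astype(float)

# Adversary: a gamma fraction of nodes connect to everyone.
bad = rng.choice(n, size=int(gamma * n), replace=False)
adj[bad, :] = 1.0
adj[:, bad] = 1.0
np.fill_diagonal(adj, 0.0)

naive = adj.sum() / (n * (n - 1))     # mean-degree estimate, pulled far from p
robust = median_degree_estimate(adj)  # within roughly gamma of the true p
```

Note that corrupted nodes also inflate the degrees of honest nodes, which is why even the median retains an additive bias of order $\gamma$; removing that bias is exactly the kind of problem the paper addresses.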
1 code implementation • 5 Jun 2021 • Sourbh Bhadane, Aaron B. Wagner, Jayadev Acharya
As one application, we consider a strictly Schur-concave constraint that estimates the number of bits needed to represent the latent variables under fixed-rate encoding, a setup that we call \emph{Principal Bit Analysis (PBA)}.
no code implementations • 21 Apr 2021 • Jayadev Acharya, Ziteng Sun, Huanyu Zhang
We consider both the "centralized setting" and the "distributed setting with information constraints" including communication and local privacy (LDP) constraints.
no code implementations • NeurIPS 2021 • Ayush Sekhari, Jayadev Acharya, Gautam Kamath, Ananda Theertha Suresh
We study the problem of unlearning datapoints from a learnt model.
no code implementations • 30 Oct 2020 • Jayadev Acharya, Peter Kairouz, YuHan Liu, Ziteng Sun
We consider the problem of estimating sparse discrete distributions under local differential privacy (LDP) and communication constraints.
no code implementations • 21 Jul 2020 • Jayadev Acharya, Clément L. Canonne, Yu-Han Liu, Ziteng Sun, Himanshu Tyagi
We study the role of interactivity in distributed statistical inference under information constraints, e.g., communication constraints and local differential privacy.
no code implementations • 14 Apr 2020 • Jayadev Acharya, Ziteng Sun, Huanyu Zhang
The technical component of our paper relates coupling between distributions to the sample complexity of estimation under differential privacy.
no code implementations • NeurIPS 2019 • Jayadev Acharya, Sourbh Bhadane, Piotr Indyk, Ziteng Sun
We consider the task of estimating the entropy of $k$-ary distributions from samples in the streaming model, where space is limited.
no code implementations • 31 Oct 2019 • Jayadev Acharya, Keith Bonawitz, Peter Kairouz, Daniel Ramage, Ziteng Sun
Local differential privacy (LDP) is a strong notion of privacy for individual users that often comes at the expense of a significant drop in utility.
no code implementations • 8 Aug 2019 • Jayadev Acharya, Ananda Theertha Suresh
A primary concern with excessive reuse of test datasets in machine learning is that it can lead to overfitting.
no code implementations • 20 Jul 2019 • Jayadev Acharya, Clément L. Canonne, Yanjun Han, Ziteng Sun, Himanshu Tyagi
We study goodness-of-fit of discrete distributions in the distributed setting, where samples are divided between multiple users who can only release a limited amount of information about their samples due to various information constraints.
no code implementations • 28 May 2019 • Jayadev Acharya, Ziteng Sun
We consider the problems of distribution estimation and heavy hitter (frequency) estimation under privacy and communication constraints.
no code implementations • 20 May 2019 • Jayadev Acharya, Clément L. Canonne, Himanshu Tyagi
We propose a general-purpose simulate-and-infer strategy that uses only private-coin communication protocols and is sample-optimal for distribution learning.
no code implementations • 28 Feb 2019 • Jayadev Acharya, Christopher De Sa, Dylan J. Foster, Karthik Sridharan
In distributed statistical learning, $N$ samples are split across $m$ machines and a learner wishes to use minimal communication to learn as well as if the examples were on a single machine.
no code implementations • 30 Dec 2018 • Jayadev Acharya, Clément L. Canonne, Himanshu Tyagi
Underlying our bounds is a characterization of the contraction in chi-square distances between the observed distributions of the samples when information constraints are placed.
no code implementations • 7 Aug 2018 • Jayadev Acharya, Clément L. Canonne, Cody Freitag, Himanshu Tyagi
We are concerned with two settings: first, when we insist on using an already deployed, general-purpose locally differentially private mechanism, such as the popular RAPPOR or the recently introduced Hadamard Response, and must build our tests on the data collected via this mechanism; and second, when no such restriction is imposed and we can design a bespoke mechanism specifically for testing.
no code implementations • NeurIPS 2018 • Jayadev Acharya, Arnab Bhattacharyya, Constantinos Daskalakis, Saravanan Kandasamy
We consider testing and learning problems on causal Bayesian networks as defined by Pearl (2009).
no code implementations • 19 Apr 2018 • Jayadev Acharya, Clément L. Canonne, Himanshu Tyagi
Nonetheless, we present a Las Vegas algorithm that simulates a single sample from the unknown distribution using $O(k/2^\ell)$ samples in expectation.
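A toy version of such a simulation protocol (a hedged sketch, not the paper's exact construction): each user is assigned a uniformly random block of $2^\ell$ domain elements, reports its sample only if the sample lands in that block, and the referee outputs the first reported value. Conditioned on a report, the value is an exact draw from the unknown distribution, and $k/2^\ell$ users are consumed in expectation.

```python
import random

def simulate_sample(p, ell, rng):
    """Simulate one exact draw from p using users who each hold one sample
    and send an ell-bit within-block index plus a reject flag.
    Returns (value, number_of_users_consumed).
    Toy sketch of an ell-bit Las Vegas simulation protocol."""
    k = len(p)
    block_size = 2 ** ell
    num_blocks = k // block_size  # assume 2**ell divides k for simplicity
    users = 0
    while True:
        users += 1
        x = rng.choices(range(k), weights=p)[0]  # user's private sample
        b = rng.randrange(num_blocks)            # user's random block
        if b * block_size <= x < (b + 1) * block_size:
            return x, users                      # ell bits identify x in block

rng = random.Random(1)
p = [0.4, 0.3, 0.1, 0.05, 0.05, 0.05, 0.03, 0.02]  # k = 8
ell = 1

draws, total_users = [], 0
for _ in range(20000):
    x, u = simulate_sample(p, ell, rng)
    draws.append(x)
    total_users += u

freq = [draws.count(i) / len(draws) for i in range(len(p))]
avg_users = total_users / 20000  # about k / 2**ell = 4 users per draw
```

The key property is that a user reports value $x$ with probability $(2^\ell/k)\,p(x)$, so conditioning on a report leaves the distribution over values exactly $p$; this is what makes the scheme Las Vegas rather than approximate.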
1 code implementation • ICML 2018 • Jayadev Acharya, Gautam Kamath, Ziteng Sun, Huanyu Zhang
We develop differentially private methods for estimating various distributional properties.
3 code implementations • 13 Feb 2018 • Jayadev Acharya, Ziteng Sun, Huanyu Zhang
All previously known sample optimal algorithms require linear (in $k$) communication from each user in the high privacy regime $(\varepsilon=O(1))$, and run in time that grows as $n\cdot k$, which can be prohibitive for large domain size $k$.
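A minimal sketch of a Hadamard-response-style mechanism (simplified; the parameter choices below are illustrative, not the paper's): input $x$ is associated with the set $C_x$ of columns where a nonzero row of a Sylvester Hadamard matrix equals $+1$; the user reports a uniform element of $C_x$ with probability $e^\varepsilon/(e^\varepsilon+1)$ and of its complement otherwise, so each message costs only $\log_2 K$ bits, and $p(x)$ is recovered from how often reports land in $C_x$.

```python
import numpy as np

def hadamard(K):
    """Sylvester-construction K x K Hadamard matrix (K a power of 2)."""
    H = np.array([[1]])
    while H.shape[0] < K:
        H = np.block([[H, H], [H, -H]])
    return H

rng = np.random.default_rng(7)
k, K, eps = 4, 8, 1.0  # domain size k, message alphabet size K > k
H = hadamard(K)
bias = np.exp(eps) / (np.exp(eps) + 1)

# Rows 1..k index the inputs (row 0 is all ones, so it is skipped).
C = [np.flatnonzero(H[x + 1] == 1) for x in range(k)]
C_comp = [np.flatnonzero(H[x + 1] == -1) for x in range(k)]

def privatize(x):
    """Report a uniform column of C_x w.p. e^eps/(e^eps+1), else a uniform
    column of its complement; every output has likelihood ratio at most
    e^eps across inputs, so the report is eps-LDP and costs log2(K) bits."""
    cols = C[x] if rng.random() < bias else C_comp[x]
    return rng.choice(cols)

p = np.array([0.5, 0.25, 0.15, 0.1])
n = 50000
samples = rng.choice(k, size=n, p=p)
reports = np.array([privatize(x) for x in samples])

# Distinct nonzero Hadamard rows agree on exactly half their columns, so for
# input x' != x a report lands in C_x w.p. 1/2, while for x itself w.p. bias.
q_hat = np.array([np.isin(reports, C[x]).mean() for x in range(k)])
p_hat = (q_hat - 0.5) / (bias - 0.5)  # invert q = p*bias + (1-p)/2
```

The one-line inversion at the end is what keeps decoding cheap: each $\hat{p}(x)$ needs only the fraction of reports falling in $C_x$, rather than a full $k \times K$ channel inversion.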
no code implementations • ICML 2017 • Jayadev Acharya, Hirakendu Das, Alon Orlitsky, Ananda Theertha Suresh
Symmetric distribution properties such as support size, support coverage, entropy, and proximity to uniformity, arise in many applications.
no code implementations • NeurIPS 2018 • Jayadev Acharya, Ziteng Sun, Huanyu Zhang
We propose a general framework to establish lower bounds on the sample complexity of statistical tasks under differential privacy.
no code implementations • 9 Nov 2016 • Jayadev Acharya, Hirakendu Das, Alon Orlitsky, Ananda Theertha Suresh
The advent of data science has spurred interest in estimating properties of distributions over large alphabets.
no code implementations • 14 Jul 2016 • Jayadev Acharya, Ilias Diakonikolas, Jerry Li, Ludwig Schmidt
We study the fixed design segmented regression problem: Given noisy samples from a piecewise linear function $f$, we want to recover $f$ up to a desired accuracy in mean-squared error.
no code implementations • NeurIPS 2015 • Jayadev Acharya, Constantinos Daskalakis, Gautam Kamath
Given samples from an unknown distribution $p$, is it possible to distinguish whether $p$ belongs to some class of distributions $\mathcal{C}$ versus $p$ being far from every distribution in $\mathcal{C}$?
no code implementations • 1 Jun 2015 • Jayadev Acharya, Ilias Diakonikolas, Jerry Li, Ludwig Schmidt
Let $f$ be the density function of an arbitrary univariate distribution, and suppose that $f$ is $\mathrm{OPT}$-close in $L_1$-distance to an unknown piecewise polynomial function with $t$ interval pieces and degree $d$.
no code implementations • 26 Nov 2014 • Jayadev Acharya, Clément L. Canonne, Gautam Kamath
We answer a question of Chakraborty et al. (ITCS 2013) showing that non-adaptive uniformity testing indeed requires $\Omega(\log n)$ queries in the conditional model.
no code implementations • 13 Oct 2014 • Jayadev Acharya, Constantinos Daskalakis
We provide a sample near-optimal algorithm for testing whether a distribution $P$ supported on $\{0, \ldots, n\}$, to which we have sample access, is a Poisson Binomial distribution or is far from all Poisson Binomial distributions.
no code implementations • 2 Aug 2014 • Jayadev Acharya, Alon Orlitsky, Ananda Theertha Suresh, Himanshu Tyagi
It was recently shown that estimating the Shannon entropy $H({\rm p})$ of a discrete $k$-symbol distribution ${\rm p}$ requires $\Theta(k/\log k)$ samples, a number that grows near-linearly in the support size.
no code implementations • 29 May 2014 • Jayadev Acharya, Ashkan Jafarpour, Alon Orlitsky, Ananda Theertha Suresh
The Poisson-sampling technique eliminates dependencies among symbol appearances in a random sequence.
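The technique can be demonstrated in a few lines (a standard textbook illustration, not code from the paper): draw $N \sim \mathrm{Poisson}(n)$ and then $N$ i.i.d. samples; the resulting symbol counts are independent $\mathrm{Poisson}(n p_i)$ variables, which the simulation below checks through their means, variances, and a cross-symbol covariance.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
p = np.array([0.5, 0.3, 0.2])
trials = 20000

counts = np.empty((trials, 3))
for t in range(trials):
    N = rng.poisson(n)                    # Poissonize the sample size
    sample = rng.choice(3, size=N, p=p)   # N i.i.d. draws from p
    counts[t] = np.bincount(sample, minlength=3)

# Under Poisson sampling, count_i ~ Poisson(n * p_i), independently:
means = counts.mean(axis=0)      # ~ n * p
variances = counts.var(axis=0)   # ~ n * p  (Poisson: variance equals mean)
cov01 = np.cov(counts[:, 0], counts[:, 1])[0, 1]  # ~ 0 (independence)
```

By contrast, with a fixed sample size $n$ the counts are multinomial, so the covariance between two symbol counts would be $-n p_0 p_1$ (here $-30$) rather than zero; Poissonization removes exactly this dependence.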
no code implementations • NeurIPS 2014 • Jayadev Acharya, Ashkan Jafarpour, Alon Orlitsky, Ananda Theertha Suresh
For mixtures of any $k$ $d$-dimensional spherical Gaussians, we derive an intuitive spectral estimator that uses $\mathcal{O}_k\bigl(\frac{d\log^2d}{\epsilon^4}\bigr)$ samples and runs in time $\mathcal{O}_{k,\epsilon}(d^3\log^5 d)$, both significantly lower than previously known.