no code implementations • ICML 2020 • Moein Falahatgar, Alon Orlitsky, Venkatadheeraj Pichapati
To derive these results we consider a probabilistic setting where several candidates for a position are asked multiple questions with the goal of finding who has the highest probability of answering interview questions correctly.
no code implementations • 5 Sep 2023 • Ayush Jain, Rajat Sen, Weihao Kong, Abhimanyu Das, Alon Orlitsky
A common approach assumes that the sources fall in one of several unknown subgroups, each with an unknown input distribution and input-output relationship.
no code implementations • 15 Feb 2022 • Yi Hao, Ayush Jain, Alon Orlitsky, Vaishakh Ravindrakumar
We derive a near-linear-time and essentially sample-optimal estimator that establishes $c_{t, d}=2$ for all $(t, d)\ne(1, 0)$.
no code implementations • 11 Feb 2022 • Ayush Jain, Alon Orlitsky, Vaishakh Ravindrakumar
However, the vast majority of them approach optimal accuracy only when given a tight upper bound on the fraction of corrupt data.
no code implementations • NeurIPS 2020 • Yi Hao, Alon Orlitsky
The profile of a sample is the multiset of its symbol frequencies.
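As a minimal illustration of this definition (the function name `profile` is an assumption made for this sketch, not taken from the paper), the profile can be computed by counting symbol appearances and then discarding the symbol labels:

```python
from collections import Counter


def profile(sample):
    """Return the profile of a sample as a sorted list of multiplicities.

    The sorted list represents the multiset of symbol frequencies:
    symbol identities are dropped and only the counts are kept.
    """
    counts = Counter(sample)        # symbol -> number of appearances
    return sorted(counts.values())  # labels discarded, counts kept


# Two samples that differ only by relabeling symbols share a profile.
print(profile("abracadabra"))
print(profile("xyzxx") == profile("abcaa"))
```

Because the profile ignores labels, it is a natural summary for estimating symmetric properties, which do not depend on which symbol carries which probability.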
no code implementations • NeurIPS 2020 • Ayush Jain, Alon Orlitsky
Many latent-variable applications, including community detection, collaborative filtering, genomic analysis, and NLP, model data as generated by low-rank matrices.
no code implementations • 26 Feb 2020 • Yi Hao, Alon Orlitsky
The profile of a sample is the multiset of its symbol frequencies.
no code implementations • NeurIPS 2020 • Ayush Jain, Alon Orlitsky
In many applications, data is collected in batches, some of which are corrupt or even adversarial.
no code implementations • NeurIPS 2020 • Yi Hao, Ayush Jain, Alon Orlitsky, Vaishakh Ravindrakumar
Sample- and computationally-efficient distribution estimation is a fundamental tenet in statistics and machine learning.
no code implementations • ICML 2020 • Ayush Jain, Alon Orlitsky
Previous estimators for this setting ran in exponential time, and for some regimes required a suboptimal number of batches.
no code implementations • NeurIPS 2019 • Yi Hao, Alon Orlitsky
We consider the fundamental learning problem of estimating properties of distributions over large domains.
1 code implementation • NeurIPS 2019 • Yi Hao, Alon Orlitsky
In particular, for every alphabet size $k$ and desired accuracy $\varepsilon$: $\textbf{Distribution estimation}$ Under $\ell_1$ distance, PML yields optimal $\Theta(k/(\varepsilon^2\log k))$ sample complexity for sorted-distribution estimation, and a PML-based estimator empirically outperforms the Good-Turing estimator on the actual distribution; $\textbf{Additive property estimation}$ For a broad class of additive properties, the PML plug-in estimator uses just four times the sample size required by the best estimator to achieve roughly twice its error, with exponentially higher confidence; $\boldsymbol{\alpha}\textbf{-R\'enyi entropy estimation}$ For integer $\alpha>1$, the PML plug-in estimator has optimal $k^{1-1/\alpha}$ sample complexity; for non-integer $\alpha>3/4$, the PML plug-in estimator has sample complexity lower than the state of the art; $\textbf{Identity testing}$ In testing whether an unknown distribution is equal to or at least $\varepsilon$ far from a given distribution in $\ell_1$ distance, a PML-based tester achieves the optimal sample complexity up to logarithmic factors of $k$.
no code implementations • NeurIPS 2018 • Yi Hao, Alon Orlitsky, Ananda T. Suresh, Yihong Wu
We design the first unified, linear-time, competitive, property estimator that for a wide class of properties and for all underlying distributions uses just $2n$ samples to achieve the performance attained by the empirical estimator with $n\sqrt{\log n}$ samples.
no code implementations • ICML 2020 • Yi Hao, Alon Orlitsky
For a large variety of distribution properties including four of the most popular ones and for every underlying distribution, they achieve the accuracy that the empirical-frequency plug-in estimators would attain using a logarithmic-factor more samples.
no code implementations • NeurIPS 2018 • Yi Hao, Alon Orlitsky, Venkatadheeraj Pichapati
We consider two problems related to the min-max risk (expected loss) of estimating an unknown $k$-state Markov chain from its $n$ sequential samples: predicting the conditional distribution of the next sample with respect to the KL-divergence, and estimating the transition matrix with respect to a natural loss induced by KL or a more general $f$-divergence measure.
no code implementations • ICML 2018 • Moein Falahatgar, Ayush Jain, Alon Orlitsky, Venkatadheeraj Pichapati, Vaishakh Ravindrakumar
We present a comprehensive understanding of three important problems in PAC preference learning: maximum selection (maxing), ranking, and estimating all pairwise preference probabilities, in the adaptive setting.
no code implementations • NeurIPS 2017 • Moein Falahatgar, Mesrob I. Ohannessian, Alon Orlitsky, Venkatadheeraj Pichapati
Minimax optimality is too pessimistic to remedy this issue.
no code implementations • NeurIPS 2017 • Moein Falahatgar, Yi Hao, Alon Orlitsky, Venkatadheeraj Pichapati, Vaishakh Ravindrakumar
PAC maximum selection (maxing) and ranking of $n$ elements via random pairwise comparisons have diverse applications and have been studied under many models and assumptions.
no code implementations • ICML 2017 • Jayadev Acharya, Hirakendu Das, Alon Orlitsky, Ananda Theertha Suresh
Symmetric distribution properties such as support size, support coverage, entropy, and proximity to uniformity, arise in many applications.
no code implementations • ICML 2017 • Moein Falahatgar, Alon Orlitsky, Venkatadheeraj Pichapati, Ananda Theertha Suresh
We consider $(\epsilon,\delta)$-PAC maximum-selection and ranking for general probabilistic models whose comparison probabilities satisfy strong stochastic transitivity and the stochastic triangle inequality.
no code implementations • NeurIPS 2016 • Moein Falahatgar, Mesrob I. Ohannessian, Alon Orlitsky
Utilizing the structure of a probabilistic model can significantly increase its learning speed.
no code implementations • 9 Nov 2016 • Jayadev Acharya, Hirakendu Das, Alon Orlitsky, Ananda Theertha Suresh
The advent of data science has spurred interest in estimating properties of distributions over large alphabets.
no code implementations • NeurIPS 2015 • Alon Orlitsky, Ananda Theertha Suresh
Second, they estimate every distribution nearly as well as the best estimator designed with prior knowledge of the exact distribution but, like all natural estimators, restricted to assigning the same probability to all symbols appearing the same number of times. Specifically, for distributions over $k$ symbols and $n$ samples, we show that for both comparisons, a simple variant of the Good-Turing estimator is always within KL divergence of $(3+o(1))/n^{1/3}$ from the best estimator, and that a more involved estimator is within $\tilde{\mathcal{O}}(\min(k/n, 1/\sqrt n))$.
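For orientation, the following is a sketch of the basic, unsmoothed textbook Good-Turing estimator, not the variant analyzed in the paper. The function name `good_turing` and its return layout are assumptions made for this example. With $\varphi_t$ denoting the number of symbols that appear exactly $t$ times, a symbol seen $t$ times is assigned probability $(t+1)\varphi_{t+1}/(\varphi_t n)$, and the total probability of unseen symbols is estimated as $\varphi_1/n$.

```python
from collections import Counter


def good_turing(sample):
    """Basic (unsmoothed) Good-Turing estimates for a non-empty sample.

    Returns (missing_mass, probs), where probs maps each observed symbol
    to its estimated probability. This is the textbook form only; it is
    not the variant studied in the paper.
    """
    n = len(sample)
    counts = Counter(sample)        # symbol -> multiplicity t
    phi = Counter(counts.values())  # phi[t] = number of symbols seen t times
    missing_mass = phi[1] / n       # estimated mass of unseen symbols
    probs = {
        s: (t + 1) * phi[t + 1] / (phi[t] * n)  # phi[t + 1] is 0 if absent
        for s, t in counts.items()
    }
    return missing_mass, probs


missing, probs = good_turing("abracadabra")
print(missing, probs)
```

The estimate depends on a symbol only through its count, so it is a natural estimator in the sense used above. The raw form can assign zero to frequent symbols whenever $\varphi_{t+1}=0$, and the estimates need not sum to one, which is one reason practical variants smooth or modify it.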
no code implementations • 23 Nov 2015 • Alon Orlitsky, Ananda Theertha Suresh, Yihong Wu
We derive a class of estimators that $\textit{provably}$ predict $U$ not just for constant $t>1$, but all the way up to $t$ proportional to $\log n$.
no code implementations • 16 Apr 2015 • Moein Falahatgar, Ashkan Jafarpour, Alon Orlitsky, Venkatadheeraj Pichapathi, Ananda Theertha Suresh
There has been considerable recent interest in distribution-tests whose run-time and sample requirements are sublinear in the domain-size $k$.
no code implementations • 27 Mar 2015 • Alon Orlitsky, Ananda Theertha Suresh
We also provide an estimator that runs in linear time and incurs competitive regret of $\tilde{\mathcal{O}}(\min(k/n, 1/\sqrt n))$, and show that for natural estimators this competitive regret is inevitable.
no code implementations • 2 Aug 2014 • Jayadev Acharya, Alon Orlitsky, Ananda Theertha Suresh, Himanshu Tyagi
It was recently shown that estimating the Shannon entropy $H({\rm p})$ of a discrete $k$-symbol distribution ${\rm p}$ requires $\Theta(k/\log k)$ samples, a number that grows near-linearly in the support size.
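For reference, the quantity being estimated is $H({\rm p}) = -\sum_i p_i \log p_i$. The sketch below shows the naive empirical plug-in estimator, which substitutes observed frequencies for the unknown probabilities. It is a baseline for illustration only, not the sample-optimal estimator behind the $\Theta(k/\log k)$ bound, and the function name `empirical_entropy` is an assumption of this example.

```python
import math
from collections import Counter


def empirical_entropy(sample):
    """Plug-in estimate of Shannon entropy, in nats, for a non-empty sample.

    Each unknown probability p_i is replaced by the empirical
    frequency count_i / n before evaluating -sum p_i log p_i.
    """
    n = len(sample)
    counts = Counter(sample)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


# A sample with two equally frequent symbols gives log 2.
print(empirical_entropy("aabb"))
```

The plug-in estimate tends to underestimate entropy when the sample is small relative to the support size, since unseen symbols contribute nothing to the sum.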
no code implementations • 29 May 2014 • Jayadev Acharya, Ashkan Jafarpour, Alon Orlitsky, Ananda Theertha Suresh
The Poisson-sampling technique eliminates dependencies among symbol appearances in a random sequence.
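A minimal sketch of the technique, under stated assumptions: instead of drawing exactly $n$ samples, draw $N \sim \mathrm{Poisson}(n)$ samples. The count of each symbol $i$ is then an independent $\mathrm{Poisson}(n p_i)$ random variable. The helper names `poisson` and `poissonized_counts` are hypothetical, and the Poisson sampler uses Knuth's multiplication method, which is only suitable for moderate means because $e^{-\lambda}$ underflows for large $\lambda$.

```python
import math
import random
from collections import Counter


def poisson(lam, rng):
    """Draw one Poisson(lam) variate via Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= threshold:
            return k
        k += 1


def poissonized_counts(symbols, probs, n, rng):
    """Draw N ~ Poisson(n) i.i.d. symbols and return their counts.

    Under this scheme the count of symbol i is Poisson(n * probs[i]),
    independently across symbols.
    """
    total = poisson(n, rng)
    draws = rng.choices(symbols, weights=probs, k=total)
    return Counter(draws)


rng = random.Random(0)
print(poissonized_counts(["a", "b", "c"], [0.5, 0.3, 0.2], 50, rng))
```

The total sample size is random and concentrates around $n$, so results proved under Poisson sampling typically transfer to fixed-size sampling with small changes in the constants.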
no code implementations • NeurIPS 2014 • Jayadev Acharya, Ashkan Jafarpour, Alon Orlitsky, Ananda Theertha Suresh
For mixtures of any $k$ $d$-dimensional spherical Gaussians, we derive an intuitive spectral-estimator that uses $\mathcal{O}_k\bigl(\frac{d\log^2d}{\epsilon^4}\bigr)$ samples and runs in time $\mathcal{O}_{k,\epsilon}(d^3\log^5 d)$, both significantly lower than previously known.