no code implementations • 21 May 2024 • Ilias Diakonikolas, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis
Instead of assuming that the online adversary chooses an arbitrary sequence of labels, we assume that the context $\mathbf{x}$ is selected adversarially but the label $y$ presented to the learner disagrees with the ground-truth label of $\mathbf{x}$ with unknown probability at most $\eta$.
no code implementations • 13 May 2024 • Vasilis Kontonis, Mingchen Ma, Christos Tzamos
We measure the complexity of the region queries via the VC dimension of the family of regions.
no code implementations • 27 Dec 2023 • Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis
In contrast, algorithms that rely only on random examples inherently require $d^{\mathrm{poly}(1/\epsilon)}$ samples and runtime, even for the basic problem of agnostically learning a single ReLU or a halfspace.
no code implementations • 8 Oct 2023 • Constantine Caramanis, Dimitris Fotakis, Alkis Kalavasis, Vasilis Kontonis, Christos Tzamos
Deep Neural Networks and Reinforcement Learning methods have empirically shown great promise in tackling challenging combinatorial problems.
no code implementations • 20 Sep 2023 • Ilias Diakonikolas, Sushrut Karmalkar, Jongho Park, Christos Tzamos
Our goal is to accurately recover a parameter vector $w$ such that the function $g(w \cdot x)$ has arbitrarily small error when compared to the true values $g(w^* \cdot x)$, rather than the noisy measurements $y$.
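To make the regression objective concrete, here is a minimal sketch in the spirit of the classical GLMtron/Isotron update for fitting $g(w \cdot x)$; the update rule, learning rate, and ReLU link below are illustrative assumptions, not necessarily the paper's algorithm.

```python
import numpy as np

def glmtron(X, y, g, iters=500, lr=0.1):
    """Fit w so that g(X @ w) approximates y, via GLMtron-style updates."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        residual = y - g(X @ w)        # pointwise prediction error
        w += lr * X.T @ residual / n   # Isotron/GLMtron-style correction
    return w

# Toy usage with a ReLU link and noisy labels.
rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0.0)
X = rng.standard_normal((2000, 5))
w_star = rng.standard_normal(5)
y = relu(X @ w_star) + 0.1 * rng.standard_normal(2000)
w_hat = glmtron(X, y, relu)
```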
no code implementations • 6 Aug 2023 • Ilias Diakonikolas, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis
In contrast, under a worst-case or random ordering, the number of mistakes must be at least $\Omega(d \log n)$, even when the points are drawn uniformly from the unit sphere and the learner only needs to predict the labels for $1\%$ of them.
no code implementations • 6 Jun 2023 • Mingchen Ma, Christos Tzamos
In this paper, we study how to buy information for stochastic optimization and formulate this question as an online learning problem.
no code implementations • 6 Dec 2022 • Ilias Diakonikolas, Christos Tzamos, Daniel M. Kane
By leveraging our strongly polynomial Forster algorithm, we obtain the first strongly polynomial time algorithm for distribution-free PAC learning of halfspaces.
no code implementations • 23 Nov 2022 • Dimitris Fotakis, Alkis Kalavasis, Christos Tzamos
We design a Markov chain whose stationary distribution coincides with $\mathcal{D}$ and give an algorithm to obtain exact samples using the technique of Coupling from the Past.
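As a concrete illustration of exact sampling via Coupling from the Past, here is a toy sketch on a small monotone chain (a lazy walk on $\{0, \dots, N\}$); the chain and update rule are placeholders, not the paper's construction.

```python
import random

N = 10

def step(state, u):
    """Monotone update rule driven by shared randomness u."""
    return max(state - 1, 0) if u < 0.5 else min(state + 1, N)

def cftp():
    us = []        # us[k] drives the step at time -(k+1); reused as T grows
    T = 1
    while True:
        while len(us) < T:            # extend randomness further into the past,
            us.append(random.random())  # keeping all previously drawn values
        lo, hi = 0, N                 # start the extreme states at time -T
        for u in reversed(us[:T]):    # run forward from time -T up to time 0
            lo, hi = step(lo, u), step(hi, u)
        if lo == hi:                  # coalescence: an exact stationary sample
            return lo
        T *= 2                        # otherwise, go further into the past

print(cftp())
```

Reusing the randomness for already-simulated time steps (rather than redrawing it) is what makes the output an exact, unbiased sample from the stationary distribution.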
no code implementations • 17 Jun 2022 • Ilias Diakonikolas, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis
For the ReLU activation, we give an efficient algorithm with sample complexity $\tilde{O}(d\, \mathrm{polylog}(1/\epsilon))$.
no code implementations • 9 Jun 2022 • Alberto Del Pia, Mingchen Ma, Christos Tzamos
Our main result is a computationally efficient algorithm that can identify large clusters with $O\left(\frac{nk \log n} {(1-2p)^2}\right) + \text{poly}\left(\log n, k, \frac{1}{1-2p} \right)$ queries, matching the guarantees of the best known algorithms in the fully-random model.
no code implementations • 26 May 2022 • Alexia Atsidakou, Constantine Caramanis, Evangelia Gergatsouli, Orestis Papadigenopoulos, Christos Tzamos
Pandora's Box is a fundamental stochastic optimization problem, where the decision-maker must find a good alternative while minimizing the search cost of exploring the value of each alternative.
no code implementations • 10 Feb 2022 • Evangelia Gergatsouli, Christos Tzamos
In Pandora's Box, we are presented with $n$ boxes, each containing an unknown value and the goal is to open the boxes in some order to minimize the sum of the search cost and the smallest value found.
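For the classical version with independent box distributions, the optimal policy is Weitzman's reservation-value rule. The sketch below assumes sample access to each distribution and is meant as a baseline illustration of that rule, not as the paper's algorithm for its more general setting.

```python
import numpy as np

def reservation_value(samples, cost, lo=-1e6, hi=1e6, tol=1e-6):
    """Find sigma with E[max(sigma - V, 0)] = cost, by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if np.mean(np.maximum(mid - samples, 0.0)) < cost:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pandoras_box(box_samples, costs, realized_values):
    """Open boxes in increasing reservation value; stop once the best
    value found is below the next box's reservation value."""
    sigmas = [reservation_value(s, c) for s, c in zip(box_samples, costs)]
    best, paid = np.inf, 0.0
    for i in np.argsort(sigmas):
        if best <= sigmas[i]:    # stopping rule: no box is worth opening
            break
        paid += costs[i]
        best = min(best, realized_values[i])
    return best + paid           # value taken plus total search cost
```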
no code implementations • NeurIPS 2021 • Ilias Diakonikolas, Jongho Park, Christos Tzamos
This supervised learning task is efficiently solvable in the realizable setting, but is known to be computationally hard with adversarial label noise.
no code implementations • 30 Aug 2021 • Shuchi Chawla, Evangelia Gergatsouli, Jeremy McMahan, Christos Tzamos
For distributions of support size $m$, UDT admits a $\log m$ approximation, and while a constant factor approximation in polynomial time is a long-standing open problem, constant factor approximations are achievable in subexponential time (arXiv:1906.11385).
no code implementations • 22 Aug 2021 • Dimitris Fotakis, Alkis Kalavasis, Vasilis Kontonis, Christos Tzamos
Our main algorithmic result is that essentially any problem learnable from fine grained labels can also be learned efficiently when the coarse data are sufficiently informative.
no code implementations • 19 Aug 2021 • Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis
We study the general problem and establish the following: For $\eta <1/2$, we give a learning algorithm for general halfspaces with sample and computational complexity $d^{O_{\eta}(\log(1/\gamma))}\mathrm{poly}(1/\epsilon)$, where $\gamma =\max\{\epsilon, \min\{\mathbf{Pr}[f(\mathbf{x}) = 1], \mathbf{Pr}[f(\mathbf{x}) = -1]\} \}$ is the bias of the target halfspace $f$.
no code implementations • NeurIPS 2021 • Ilias Diakonikolas, Daniel M. Kane, Christos Tzamos
A Forster transform is an operation that turns a distribution into one with good anti-concentration properties.
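A rough picture of what a Forster transform does: alternately project the points to the unit sphere and whiten their second moment until it is close to $\frac{1}{d} I$. The simple fixed-point loop below is only a heuristic illustration assuming points in general position; the paper's contribution is an algorithm with provable guarantees.

```python
import numpy as np

def forster_transform(X, iters=100):
    """Heuristic iteration toward radial isotropic position for the
    rows of X (shape (n, d)); assumes the points span R^d."""
    n, d = X.shape
    A = np.eye(d)
    for _ in range(iters):
        Y = X @ A.T
        Y /= np.linalg.norm(Y, axis=1, keepdims=True)  # radial projection
        M = Y.T @ Y / n                                # second moment
        L = np.linalg.cholesky(d * M)
        A = np.linalg.inv(L) @ A   # after this step, W M W^T = I / d
    return A
```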
no code implementations • 14 Jun 2021 • Ilias Diakonikolas, Russell Impagliazzo, Daniel Kane, Rex Lei, Jessica Sorrell, Christos Tzamos
Our upper and lower bounds characterize the complexity of boosting in the distribution-independent PAC model with Massart noise.
no code implementations • 10 Feb 2021 • Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis
We study the problem of agnostically learning halfspaces under the Gaussian distribution.
no code implementations • 1 Dec 2020 • Vasilis Kontonis, Sihan Liu, Christos Tzamos
Our main result is that training the Generator together with a Discriminator according to the Stochastic Gradient Descent-Ascent iteration proposed by Goodfellow et al. yields a Generator distribution that approaches the target distribution of $f_*$.
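Schematically, the Gradient Descent-Ascent dynamic alternates a discriminator ascent step with a generator descent step on the minimax GAN objective. The toy PyTorch sketch below uses an illustrative linear generator and a made-up Gaussian target, not the exact setting analyzed in the paper.

```python
import torch

G = torch.nn.Linear(1, 1)                 # toy generator: z -> a*z + b
D = torch.nn.Sequential(torch.nn.Linear(1, 8), torch.nn.Tanh(),
                        torch.nn.Linear(8, 1))
opt_G = torch.optim.SGD(G.parameters(), lr=1e-2)
opt_D = torch.optim.SGD(D.parameters(), lr=1e-2)
bce = torch.nn.BCEWithLogitsLoss()

for t in range(5000):
    z = torch.randn(64, 1)
    real = 2.0 * torch.randn(64, 1) + 1.0  # target distribution N(1, 4)
    fake = G(z)
    # Ascent step on the discriminator: push real -> 1, fake -> 0.
    d_loss = (bce(D(real), torch.ones(64, 1))
              + bce(D(fake.detach()), torch.zeros(64, 1)))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()
    # Descent step on the generator: try to fool the discriminator.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()
```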
no code implementations • 22 Oct 2020 • Constantinos Daskalakis, Themis Gouleakis, Christos Tzamos, Manolis Zampetakis
We provide a computationally and statistically efficient estimator for the classical problem of truncated linear regression, where the dependent variable $y = w^T x + \epsilon$ and its corresponding vector of covariates $x \in \mathbb{R}^k$ are only revealed if the dependent variable falls in some subset $S \subseteq \mathbb{R}$; otherwise the existence of the pair $(x, y)$ is hidden.
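The estimators in this line of work run stochastic gradient ascent on the truncated log-likelihood, estimating the conditional-expectation term in the gradient by rejection sampling. A simplified sketch, under the simplifying assumptions of unit noise variance and a known interval $S = [lo, hi]$:

```python
import numpy as np

rng = np.random.default_rng(0)

def truncated_linreg(X, y, S=(0.0, np.inf), lr=0.01, epochs=50):
    lo, hi = S
    n, k = X.shape
    w = np.linalg.lstsq(X, y, rcond=None)[0]  # warm start (biased by truncation)
    for _ in range(epochs):
        for i in rng.permutation(n):
            mu = X[i] @ w
            # Rejection-sample z ~ N(mu, 1) conditioned on z in S.
            z = rng.normal(mu, 1.0, size=1000)
            z = z[(z >= lo) & (z <= hi)]
            if z.size == 0:
                continue
            # Stochastic gradient of the truncated log-likelihood:
            # (y_i - mu) minus the conditional mean shift E[z - mu | z in S].
            w += lr * ((y[i] - mu) - (z.mean() - mu)) * X[i]
    return w
```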
no code implementations • 4 Oct 2020 • Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis
We give the first polynomial-time algorithm for this fundamental learning problem.
no code implementations • 5 Jul 2020 • Dimitris Fotakis, Alkis Kalavasis, Christos Tzamos
A stunning consequence is that virtually any statistical task (e.g., learning in total variation distance, parameter estimation, uniformity or identity testing) that can be performed efficiently for Boolean product distributions, can also be performed from truncated samples, with a small increase in sample complexity.
no code implementations • 11 Jun 2020 • Ilias Diakonikolas, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis
In the Tsybakov noise model, each label is independently flipped with some probability which is controlled by an adversary.
no code implementations • NeurIPS 2020 • Ilias Diakonikolas, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis
We study the problem of agnostically learning homogeneous halfspaces in the distribution-specific PAC model.
no code implementations • ICML 2020 • Evangelia Gergatsouli, Brendan Lucier, Christos Tzamos
In this work we develop algorithms that are able to restore monotonicity in the parameters of interest.
no code implementations • 13 Feb 2020 • Ilias Diakonikolas, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis
We study the problem of learning halfspaces with Massart noise in the distribution-specific PAC model.
no code implementations • 10 Feb 2020 • Zifan Liu, Jongho Park, Theodoros Rekatsinas, Christos Tzamos
We study the problem of robust mean estimation and introduce a novel Hamming distance-based measure of distribution shift for coordinate-level corruptions.
no code implementations • 2 Aug 2019 • Vasilis Kontonis, Christos Tzamos, Manolis Zampetakis
Our main result is a computationally and sample efficient algorithm for estimating the parameters of the Gaussian under arbitrary unknown truncation sets whose performance decays with a natural measure of complexity of the set, namely its Gaussian surface area.
no code implementations • NeurIPS 2019 • Ilias Diakonikolas, Themis Gouleakis, Christos Tzamos
The goal is to find a hypothesis $h$ that minimizes the misclassification error $\mathbf{Pr}_{(\mathbf{x}, y) \sim \mathcal{D}} \left[ h(\mathbf{x}) \neq y \right]$.
no code implementations • 26 Apr 2019 • Daniel Alabi, Adam Tauman Kalai, Katrina Ligett, Cameron Musco, Christos Tzamos, Ellen Vitercik
We present an algorithm that learns to maximally prune the search space on repeated computations, thereby reducing runtime while provably outputting the correct solution each period with high probability.
no code implementations • 11 Sep 2018 • Constantinos Daskalakis, Themis Gouleakis, Christos Tzamos, Manolis Zampetakis
We provide an efficient algorithm for the classical problem, going back to Galton, Pearson, and Fisher, of estimating, with arbitrary accuracy, the parameters of a multivariate normal distribution from truncated samples.
no code implementations • 17 Jul 2018 • Gautam Kamath, Christos Tzamos
This is an exponential improvement over the previous best upper bound, and demonstrates that the complexity of the problem in this model is intermediate between its complexity in the standard sampling model and in the adaptive conditional sampling model.
no code implementations • 20 Feb 2018 • Steve Hanneke, Adam Kalai, Gautam Kamath, Christos Tzamos
A generative model may generate utter nonsense when it is fit to maximize the likelihood of observed data.
no code implementations • ICML 2017 • Arturs Backurs, Christos Tzamos
The classic algorithm of Viterbi computes the most likely path in a Hidden Markov Model (HMM) that results in a given sequence of observations.
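For reference, the textbook Viterbi dynamic program in log-space (the paper studies the fine-grained complexity of this recursion, not the recursion itself):

```python
import numpy as np

def viterbi(obs, log_init, log_trans, log_emit):
    """obs: observation indices; log_init: (S,); log_trans: (S, S);
    log_emit: (S, O). Returns the most likely hidden state path."""
    T, S = len(obs), log_init.shape[0]
    dp = np.zeros((T, S))                 # dp[t, s]: best log-prob ending in s
    back = np.zeros((T, S), dtype=int)    # backpointers for path recovery
    dp[0] = log_init + log_emit[:, obs[0]]
    for t in range(1, T):
        scores = dp[t - 1][:, None] + log_trans   # scores[prev, cur]
        back[t] = scores.argmax(axis=0)
        dp[t] = scores.max(axis=0) + log_emit[:, obs[t]]
    path = [int(dp[-1].argmax())]
    for t in range(T - 1, 0, -1):         # trace backpointers from the end
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```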
no code implementations • 23 Feb 2017 • Constantinos Daskalakis, Christos Tzamos, Manolis Zampetakis
Our first result is a strong converse of Banach's theorem, showing that it is a universal analysis tool for establishing global convergence of iterative methods to unique fixed points, and for bounding their convergence rate.
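A toy instance of the forward direction: $x \mapsto \cos x$ is a contraction on $[0, 1]$ (its derivative is bounded by $\sin(1) < 1$ there), so by Banach's theorem the iteration converges geometrically to the unique fixed point.

```python
import math

# Iterating a contraction map converges to its unique fixed point.
x = 0.5
for i in range(30):
    x = math.cos(x)
print(x)   # ~0.739085, the unique solution of cos(x) = x
```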
no code implementations • 2 Jan 2017 • Iddan Golomb, Christos Tzamos
We address the problem of locating facilities on the $[0, 1]$ interval based on reports from strategic agents.
no code implementations • 1 Sep 2016 • Constantinos Daskalakis, Christos Tzamos, Manolis Zampetakis
In the finite sample regime, we show that, under a random initialization, $\tilde{O}(d/\epsilon^2)$ samples suffice to compute the unknown vectors to within $\epsilon$ in Mahalanobis distance, where $d$ is the dimension.
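In the balanced, symmetric case (means $\pm\mu$, identity covariance), each EM step collapses to a one-line tanh update, which the sketch below iterates from a random initialization; the setup and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 5, 20000
mu_star = rng.standard_normal(d)
signs = rng.choice([-1.0, 1.0], size=(n, 1))
X = signs * mu_star + rng.standard_normal((n, d))  # mixture of N(+/-mu, I)

mu = 0.1 * rng.standard_normal(d)   # random initialization
for step in range(10):              # a few EM steps suffice here
    # E-step + M-step combined: posterior weights become tanh(<mu, x>).
    mu = (np.tanh(X @ mu)[:, None] * X).mean(axis=0)

# The answer is recovered up to the inherent sign ambiguity.
print(np.linalg.norm(mu - mu_star), np.linalg.norm(mu + mu_star))
```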
no code implementations • 16 Aug 2016 • Themistoklis Gouleakis, Christos Tzamos, Manolis Zampetakis
In contrast to prior algorithms for the classic model, our algorithms have time, space and sample complexity that is polynomial in the dimension and polylogarithmic in the number of points.
no code implementations • 11 Nov 2015 • Constantinos Daskalakis, Anindya De, Gautam Kamath, Christos Tzamos
Finally, leveraging the structural properties of the Fourier spectrum of PMDs we show that these distributions can be learned from $O_k(1/\varepsilon^2)$ samples in ${\rm poly}_k(1/\varepsilon)$-time, removing the quasi-polynomial dependence of the running time on $1/\varepsilon$ from the algorithm of Daskalakis, Kamath, and Tzamos.
no code implementations • 30 Apr 2015 • Constantinos Daskalakis, Gautam Kamath, Christos Tzamos
We prove a structural characterization of these distributions, showing that, for all $\varepsilon >0$, any $(n, k)$-Poisson multinomial random vector is $\varepsilon$-close, in total variation distance, to the sum of a discretized multidimensional Gaussian and an independent $(\text{poly}(k/\varepsilon), k)$-Poisson multinomial random vector.