no code implementations • 29 Apr 2024 • Khashayar Gatmiry, Jonathan Kelner, Holden Lee
We give a new algorithm for learning mixtures of $k$ Gaussians (with identity covariance in $\mathbb{R}^n$) to TV error $\varepsilon$, with quasi-polynomial ($O\big(n^{\mathrm{poly}\log((n+k)/\varepsilon)}\big)$) time and sample complexity, under a minimum weight assumption.
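As a point of reference only, the snippet below sets up the problem the abstract describes -- data drawn from a mixture of $k$ identity-covariance Gaussians in $\mathbb{R}^n$ -- and fits it with scikit-learn's EM-based `GaussianMixture`. This is a baseline illustrating the setting, not the paper's quasi-polynomial algorithm; the separation scale `5.0` and sample sizes are arbitrary choices.

```python
# Problem setup sketch: learn a mixture of k identity-covariance Gaussians.
# Uses scikit-learn's EM baseline, NOT the paper's quasi-polynomial algorithm.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
n, k, samples_per_component = 10, 3, 2000

# Sample well-separated means and draw points with identity covariance.
means = rng.normal(scale=5.0, size=(k, n))
X = np.vstack([rng.normal(loc=m, size=(samples_per_component, n)) for m in means])

# 'spherical' ties each component's covariance to sigma^2 * I, matching the
# identity-covariance setting (up to a learned scale).
gmm = GaussianMixture(n_components=k, covariance_type="spherical").fit(X)
print("estimated means:\n", gmm.means_.round(2))
```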
no code implementations • 23 Feb 2024 • Jonathan Kelner, Frederic Koehler, Raghu Meka, Dhruv Rohatgi
It is well-known that the statistical performance of Lasso can suffer significantly when the covariates of interest have strong correlations.
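A minimal numerical illustration of this failure mode (not drawn from the paper): with two nearly identical covariates, Lasso can place its weight on the wrong one, so support recovery breaks even when prediction stays reasonable. The signal strengths and noise level below are arbitrary.

```python
# Illustration: with two nearly identical covariates, Lasso may split or
# misassign the signal instead of recovering the true support.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
N = 500
x1 = rng.normal(size=N)
x2 = x1 + 0.01 * rng.normal(size=N)                  # strongly correlated copy of x1
x3 = rng.normal(size=N)                              # independent covariate
X = np.column_stack([x1, x2, x3])
y = 1.0 * x1 + 0.5 * x3 + 0.1 * rng.normal(size=N)   # true support: {x1, x3}

print(Lasso(alpha=0.1).fit(X, y).coef_)  # weight may land on x2 rather than x1
```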
no code implementations • 1 Mar 2023 • Khashayar Gatmiry, Jonathan Kelner, Santosh S. Vempala
We introduce a hybrid of the Lewis weights barrier and the standard logarithmic barrier and prove that the mixing rate for the corresponding RHMC is bounded by $\tilde O(m^{1/3}n^{4/3})$, improving on the previous best bound of $\tilde O(mn^{2/3})$ (based on the log barrier).
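For orientation, here is a sketch of the standard logarithmic barrier over a polytope $\{x : Ax \le b\}$, the simpler of the two ingredients combined above; the Lewis weights barrier and the RHMC dynamics themselves take more than a few lines and are omitted. The function and the unit-box example are illustrative, not code from the paper.

```python
# Minimal sketch of the standard logarithmic barrier for {x : Ax < b},
# one of the two ingredients in the hybrid barrier described above.
# (The Lewis-weights term and RHMC itself are omitted.)
import numpy as np

def log_barrier(A, b, x):
    """phi(x) = -sum_i log(b_i - a_i . x), with its gradient and Hessian."""
    s = b - A @ x                        # slacks; must be strictly positive
    assert np.all(s > 0), "x must lie strictly inside the polytope"
    phi = -np.log(s).sum()
    grad = A.T @ (1.0 / s)               # sum_i a_i / s_i
    hess = A.T @ np.diag(1.0 / s**2) @ A # sum_i a_i a_i^T / s_i^2
    return phi, grad, hess

# Example: the unit box [-1, 1]^2 written as Ax <= b.
A = np.vstack([np.eye(2), -np.eye(2)])
b = np.ones(4)
print(log_barrier(A, b, np.zeros(2))[0])  # barrier value at the center
```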
no code implementations • 4 Nov 2021 • Jonathan Kelner, Annie Marsden, Vatsal Sharan, Aaron Sidford, Gregory Valiant, Honglin Yuan
We consider the problem of minimizing a function $f : \mathbb{R}^d \rightarrow \mathbb{R}$ that is implicitly decomposable as the sum of $m$ unknown, non-interacting, smooth, strongly convex functions, and we provide a method that solves this problem with a number of gradient evaluations that scales (up to logarithmic factors) as the product of the square roots of the components' condition numbers.
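Stated as a formula, with $\kappa_i = L_i/\mu_i$ as my labels for the condition number of the $i$-th component (notation not fixed in the excerpt), the complexity claim reads

$$\text{gradient evaluations} \;=\; \tilde{O}\!\Big(\prod_{i=1}^{m} \sqrt{\kappa_i}\Big).$$

Roughly, when the components live at nested scales the overall condition number of $f$ can be as large as $\prod_i \kappa_i$, so this matches what accelerated gradient descent would pay on $f$ directly -- without knowing the decomposition.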
no code implementations • ICLR 2022 • Khashayar Gatmiry, Stefanie Jegelka, Jonathan Kelner
While there has been substantial recent work studying generalization in neural networks, the ability of deep nets to automate the process of feature extraction still evades a thorough mathematical understanding.
no code implementations • 17 Jun 2021 • Jonathan Kelner, Frederic Koehler, Raghu Meka, Dhruv Rohatgi
First, we show that the preconditioned Lasso can solve a large class of sparse linear regression problems nearly optimally: it succeeds whenever the dependency structure of the covariates, in the sense of the Markov property, has low treewidth -- even if the covariance matrix $\Sigma$ is highly ill-conditioned.
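A generic sketch of what "preconditioned Lasso" means operationally: reparametrize the design by a matrix $M$, run Lasso in the new coordinates, and map the coefficients back. The choice $M = \Sigma^{-1/2}$ below is purely illustrative -- the paper's point is that a good preconditioner can be derived from the low-treewidth Markov structure of $\Sigma$, and recovery quality depends on that choice.

```python
# Generic preconditioned-Lasso sketch: run Lasso on the reparametrized design
# X @ M and map back. M = Sigma^{-1/2} is illustrative only; the paper derives
# its preconditioner from the Markov/treewidth structure of Sigma.
import numpy as np
from scipy.linalg import sqrtm
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
d, N = 20, 400
# An ill-conditioned covariance with short-range (low-treewidth) dependence.
Sigma = 0.95 ** np.abs(np.subtract.outer(np.arange(d), np.arange(d)))
X = rng.multivariate_normal(np.zeros(d), Sigma, size=N)
w_true = np.zeros(d); w_true[[3, 11]] = 1.0
y = X @ w_true + 0.1 * rng.normal(size=N)

M = np.linalg.inv(np.real(sqrtm(Sigma)))     # illustrative preconditioner
theta = Lasso(alpha=0.05).fit(X @ M, y).coef_
w_hat = M @ theta                            # map back to original coordinates
print(np.round(w_hat, 2))
```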
no code implementations • NeurIPS 2020 • Jonathan Kelner, Frederic Koehler, Raghu Meka, Ankur Moitra
While there are a variety of algorithms (e.g., Graphical Lasso, CLIME) that provably recover the graph structure with a logarithmic number of samples, they assume various conditions that require the precision matrix to be in some sense well-conditioned.
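For concreteness, this is the classical Graphical Lasso baseline the abstract refers to, run on a well-conditioned chain-structured example where it is known to work; the paper's contribution is handling precision matrices without such conditioning assumptions. The sizes and regularization strength are arbitrary.

```python
# Classical baseline referenced above: Graphical Lasso recovers the sparsity
# pattern of the precision matrix Theta = Sigma^{-1} on an easy, well-
# conditioned instance (a chain graph).
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(3)
d = 5
# Sparse, well-conditioned precision matrix of a chain graph.
Theta = np.eye(d) + 0.4 * (np.eye(d, k=1) + np.eye(d, k=-1))
X = rng.multivariate_normal(np.zeros(d), np.linalg.inv(Theta), size=2000)

model = GraphicalLasso(alpha=0.05).fit(X)
print(np.round(model.precision_, 2))  # nonzeros ~ edges of the chain
```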
no code implementations • 23 Dec 2013 • Boaz Barak, Jonathan Kelner, David Steurer
Aside from being a natural relaxation, this is also motivated by a connection to the Small Set Expansion problem shown by Barak et al. (STOC 2012), and our results yield an improvement for that problem.