no code implementations • 26 Sep 2023 • Margalit Glasgow
To our knowledge, this work is the first to give a sample complexity of $\tilde{O}(d)$ for efficiently learning the XOR function on isotropic data with a standard neural network and standard training.
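For illustration only (the listing shows no released code, and the paper is theoretical), here is a minimal sketch of the kind of setup described: it assumes the "XOR function" means the 2-sparse parity $y = x_1 x_2$ over the isotropic Boolean hypercube, and all hyperparameters (dimension, width, step size, steps) are arbitrary illustrative choices, not values from the paper.

```python
# Minimal sketch (not the paper's code): learning the 2-sparse parity
# ("XOR") y = x_1 * x_2 over isotropic {-1,+1}^d inputs with a standard
# two-layer ReLU network trained by plain minibatch SGD.
import torch
import torch.nn as nn

d, width, n_samples = 50, 512, 5000          # illustrative sizes; n ~ d up to log factors
X = torch.randint(0, 2, (n_samples, d)).float() * 2 - 1   # isotropic Boolean data
y = X[:, 0] * X[:, 1]                                      # XOR of two coordinates

model = nn.Sequential(nn.Linear(d, width), nn.ReLU(), nn.Linear(width, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.05)
loss_fn = nn.MSELoss()                        # loss/activation in the paper may differ

for step in range(5000):
    idx = torch.randint(0, n_samples, (128,))              # minibatch SGD
    loss = loss_fn(model(X[idx]).squeeze(-1), y[idx])
    opt.zero_grad()
    loss.backward()
    opt.step()

with torch.no_grad():
    acc = (model(X).squeeze(-1).sign() == y).float().mean().item()
print(f"train accuracy: {acc:.3f}")
```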
no code implementations • 6 Mar 2023 • Margalit Glasgow, Alexander Rakhlin
Our lower bound shows that the $\gamma$-DEC is a fundamental limit for any model class $\mathcal{F}$: for any algorithm, there exists some $f \in \mathcal{F}$ for which the $\gamma$-regret of that algorithm scales (nearly) with the $\gamma$-DEC of $\mathcal{F}$.
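Schematically, suppressing constants, logarithmic factors, and the exact dependence on the time horizon (the precise statement is in the paper), the lower bound reads:

$$\text{for every algorithm, } \exists f \in \mathcal{F}: \quad \mathbb{E}\big[\mathrm{Reg}_\gamma\big] \;\gtrsim\; \mathrm{dec}_\gamma(\mathcal{F}),$$

where $\mathrm{dec}_\gamma(\mathcal{F})$ denotes the $\gamma$-DEC of the model class.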
no code implementations • 16 Jun 2022 • Margalit Glasgow, Colin Wei, Mary Wootters, Tengyu Ma
Nagarajan and Kolter (2019) show that in certain simple linear and neural-network settings, any uniform convergence (UC) bound will be vacuous, leaving open the question of how to prove generalization in settings where UC fails.
1 code implementation • 5 Nov 2021 • Margalit Glasgow, Honglin Yuan, Tengyu Ma
In this work, we first resolve this question by providing a lower bound for FedAvg that matches the existing upper bound, showing that the existing upper-bound analysis for FedAvg cannot be improved.
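For context, a minimal sketch of the FedAvg / local-SGD scheme the bounds concern (this is not taken from the paper's released implementation): each client runs several local gradient steps on its own objective, then the server averages the local iterates. The per-client quadratic objectives and all constants below are illustrative assumptions, and the local steps are deterministic here for simplicity, whereas the paper's setting is stochastic.

```python
# Minimal FedAvg / local-SGD sketch (illustrative; not the paper's code).
# Each of M clients minimizes f_m(x) = 0.5 * ||A_m x - b_m||^2 locally;
# the server averages client iterates once per round.
import numpy as np

rng = np.random.default_rng(0)
M, K, R, d, lr = 8, 10, 50, 20, 0.005   # clients, local steps, rounds, dim, step size
A = [rng.standard_normal((30, d)) for _ in range(M)]
b = [rng.standard_normal(30) for _ in range(M)]

def grad(m, x):
    return A[m].T @ (A[m] @ x - b[m])    # gradient of client m's objective

x = np.zeros(d)
for r in range(R):
    local_iterates = []
    for m in range(M):
        xm = x.copy()
        for _ in range(K):               # K local steps with no communication
            xm -= lr * grad(m, xm)
        local_iterates.append(xm)
    x = np.mean(local_iterates, axis=0)  # server averages client iterates

obj = sum(0.5 * np.linalg.norm(A[m] @ x - b[m])**2 for m in range(M))
print("global objective:", obj)
```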
no code implementations • 22 Sep 2020 • Margalit Glasgow, Mary Wootters
This complexity sits squarely between the complexity $\tilde{O}\left(\left(n + \kappa\right)\log(1/\epsilon)\right)$ of SAGA \textit{without delays} and the complexity $\tilde{O}\left(\left(n + m\kappa\right)\log(1/\epsilon)\right)$ of parallel asynchronous algorithms where the delays are \textit{arbitrary} (but bounded by $O(m)$), and the data is accessible to all machines.
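For reference, a minimal sketch of the baseline SAGA update (without delays) that the quoted complexities refer to, on a hypothetical least-squares problem; the paper's contribution concerns an asynchronous, delayed variant, which this sketch does not model.

```python
# Minimal SAGA sketch without delays (baseline only; the paper analyzes
# a delayed variant). Objective: (1/n) * sum_i 0.5 * (a_i . x - b_i)^2.
import numpy as np

rng = np.random.default_rng(0)
n, d, lr, steps = 200, 20, 0.005, 10000
a = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def grad_i(i, x):
    return (a[i] @ x - b[i]) * a[i]      # gradient of the i-th term

x = np.zeros(d)
table = np.array([grad_i(i, x) for i in range(n)])   # stored past gradients
avg = table.mean(axis=0)                             # their running average

for _ in range(steps):
    i = rng.integers(n)
    g = grad_i(i, x)
    x -= lr * (g - table[i] + avg)       # SAGA variance-reduced step
    avg += (g - table[i]) / n            # update running average
    table[i] = g                         # refresh stored gradient

print("objective:", 0.5 * np.mean((a @ x - b)**2))
```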
no code implementations • 17 Jun 2020 • Margalit Glasgow, Mary Wootters
Recent work has studied approximate gradient coding, which concerns coding schemes where the replication factor of the data is too low to recover the full gradient exactly.
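To illustrate the object of study, here is a sketch of one standard construction, fractional repetition, in the approximate regime: with replication factor $r$ too low to survive every straggler pattern, groups whose workers all straggle are simply dropped, yielding an approximation of the full gradient. This is an assumed illustrative scheme, not necessarily the one analyzed in the paper.

```python
# Illustrative approximate gradient coding via fractional repetition
# (not necessarily the paper's scheme). m workers are split into m/r
# groups of r identical workers; each group holds a disjoint set of
# data blocks and each worker returns the sum of its blocks' gradients.
import numpy as np

rng = np.random.default_rng(0)
n_blocks, m_workers, r, dim = 12, 6, 2, 5     # replication factor r
block_grads = rng.standard_normal((n_blocks, dim))   # per-block gradients (given)

groups = m_workers // r                        # 3 groups of r identical workers
per_group = n_blocks // groups                 # each group holds 4 blocks
group_blocks = [list(range(g * per_group, (g + 1) * per_group))
                for g in range(groups)]

def worker_output(g):
    # every worker in group g returns the same partial sum
    return block_grads[group_blocks[g]].sum(axis=0)

alive = [rng.random(r) > 0.3 for _ in range(groups)]  # ~30% stragglers

est = np.zeros(dim)
for g in range(groups):
    if alive[g].any():                         # any one survivor per group suffices
        est += worker_output(g)
# groups with no survivors are dropped entirely -> approximate gradient

full = block_grads.sum(axis=0)
print("relative error:", np.linalg.norm(est - full) / np.linalg.norm(full))
```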