no code implementations • 16 Jun 2022 • Margalit Glasgow, Colin Wei, Mary Wootters, Tengyu Ma
Nagarajan and Kolter (2019) show that in certain simple linear and neural-network settings, any uniform convergence (UC) bound will be vacuous, leaving open the question of how to prove generalization in settings where UC fails.
no code implementations • 6 Apr 2022 • Jeff Z. HaoChen, Colin Wei, Ananya Kumar, Tengyu Ma
In particular, a linear classifier trained to separate the representations on the source domain can also predict classes on the target domain accurately, even though the representations of the two domains are far from each other.
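For concreteness, here is a minimal sketch of the linear-probe transfer described above: fit a linear classifier on frozen source-domain representations and evaluate it unchanged on target-domain representations. The feature arrays, dimensions, and class counts below are placeholders, not the paper's setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical arrays standing in for frozen representations of the source
# and target domains (random here purely for illustration).
rng = np.random.default_rng(0)
src_feats, src_labels = rng.standard_normal((500, 64)), rng.integers(0, 10, 500)
tgt_feats, tgt_labels = rng.standard_normal((500, 64)), rng.integers(0, 10, 500)

# Train the probe on source representations only, then score it on the target.
probe = LogisticRegression(max_iter=1000).fit(src_feats, src_labels)
print("target-domain accuracy of the source-trained probe:",
      probe.score(tgt_feats, tgt_labels))
```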
no code implementations • 29 Sep 2021 • Colin Wei, Yining Chen, Tengyu Ma
A common lens to theoretically study neural net architectures is to analyze the functions they can approximate.
no code implementations • ICLR 2022 • Colin Wei, J Zico Kolter
Our key insights are that these interval bounds can be obtained as the fixed-point solution to an IBP-inspired equilibrium equation, and furthermore, that this solution always exists and is unique when the layer obeys a certain parameterization.
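A minimal sketch of the idea, assuming an implicit ReLU layer z = relu(Wz + Ux + b) and a small-norm W as a stand-in for the parameterization condition mentioned above: iterate the IBP-style interval map until it reaches its fixed point, which then bounds the equilibrium for every input in the box.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 8, 4
W = 0.2 * rng.standard_normal((d, d)) / np.sqrt(d)  # small norm so the interval map contracts
U = rng.standard_normal((d, n))
b = rng.standard_normal(d)

# Input box x in [x_lo, x_hi]; split weights into positive/negative parts for IBP.
x_lo, x_hi = -np.ones(n), np.ones(n)
W_pos, W_neg = np.clip(W, 0, None), np.clip(W, None, 0)
U_pos, U_neg = np.clip(U, 0, None), np.clip(U, None, 0)

z_lo, z_hi = np.zeros(d), np.zeros(d)
for _ in range(200):
    pre_lo = W_pos @ z_lo + W_neg @ z_hi + U_pos @ x_lo + U_neg @ x_hi + b
    pre_hi = W_pos @ z_hi + W_neg @ z_lo + U_pos @ x_hi + U_neg @ x_lo + b
    new_lo, new_hi = np.maximum(pre_lo, 0), np.maximum(pre_hi, 0)
    if np.allclose(new_lo, z_lo) and np.allclose(new_hi, z_hi):
        break  # reached the fixed point of the interval map
    z_lo, z_hi = new_lo, new_hi

print("interval bounds on the equilibrium:", z_lo, z_hi)
```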
1 code implementation • NeurIPS 2021 • Colin Wei, Sang Michael Xie, Tengyu Ma
The generative model in our analysis is either a Hidden Markov Model (HMM) or an HMM augmented with a latent memory component, motivated by long-term dependencies in natural language.
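To make the modeling assumption concrete, here is a toy HMM of the kind referred to above; the sizes and variable names are chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_states, vocab_size = 5, 20
T = rng.dirichlet(np.ones(n_states), size=n_states)    # state-transition matrix
E = rng.dirichlet(np.ones(vocab_size), size=n_states)  # per-state emission distribution
pi = rng.dirichlet(np.ones(n_states))                   # initial-state distribution

def sample_sequence(length=10):
    """Sample a token sequence from the toy HMM."""
    s = rng.choice(n_states, p=pi)
    tokens = []
    for _ in range(length):
        tokens.append(int(rng.choice(vocab_size, p=E[s])))
        s = rng.choice(n_states, p=T[s])
    return tokens

print(sample_sequence())
```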
1 code implementation • NeurIPS 2021 • Jeff Z. HaoChen, Colin Wei, Adrien Gaidon, Tengyu Ma
Despite the empirical successes, theoretical foundations are limited: prior analyses assume conditional independence of the positive pairs given the same class label, but recent empirical applications use heavily correlated positive pairs (i.e., data augmentations of the same image).
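As a rough illustration of the setting, the snippet below computes a contrastive objective of the spectral form analyzed in this line of work on augmentation-based positive pairs; the constants, negative sampling, and encoder are simplified or stubbed out.

```python
import torch

def spectral_style_contrastive_loss(z1, z2):
    """z1[i], z2[i]: embeddings of two augmentations of image i (a positive pair).
    Loss = -2 * E[z1 . z2_pos] + E[(z1 . z2_neg)^2]; the second term here averages
    over all cross pairs in the batch, diagonal included, which is a simplification
    adequate for a sketch."""
    pos = (z1 * z2).sum(dim=1).mean()
    neg = (z1 @ z2.T).pow(2).mean()
    return -2 * pos + neg

# Placeholder embeddings standing in for encoder outputs on two augmented views.
z1, z2 = torch.randn(256, 128), torch.randn(256, 128)
print(spectral_style_contrastive_loss(z1, z2).item())
```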
no code implementations • 3 Nov 2020 • Hong Liu, Jeff Z. HaoChen, Colin Wei, Tengyu Ma
Recent work has found that fine-tuning and joint training, two popular approaches to transfer learning, do not always improve accuracy on downstream tasks.
no code implementations • ICLR 2021 • Colin Wei, Kendrick Shen, Yining Chen, Tengyu Ma
Self-training algorithms, which train a model to fit pseudolabels predicted by another previously-learned model, have been very successful for learning with unlabeled data using neural networks.
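A minimal sketch of the pseudolabeling loop described above, with placeholder models and data; the hard-label choice, loss, and optimizer are illustrative rather than any specific paper's recipe.

```python
import torch
import torch.nn.functional as F

def self_train(student, teacher, unlabeled_loader, epochs=1, lr=1e-3):
    """Fit the student to hard pseudolabels produced by a fixed, previously learned teacher."""
    opt = torch.optim.SGD(student.parameters(), lr=lr)
    teacher.eval()
    for _ in range(epochs):
        for x in unlabeled_loader:
            with torch.no_grad():
                pseudo = teacher(x).argmax(dim=1)       # teacher's predicted labels
            loss = F.cross_entropy(student(x), pseudo)  # student fits the pseudolabels
            opt.zero_grad()
            loss.backward()
            opt.step()
    return student

# Toy usage with linear "networks" and random unlabeled batches.
teacher, student = torch.nn.Linear(20, 5), torch.nn.Linear(20, 5)
loader = [torch.randn(32, 20) for _ in range(10)]
self_train(student, teacher, loader)
```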
no code implementations • NeurIPS 2020 • Yining Chen, Colin Wei, Ananya Kumar, Tengyu Ma
In unsupervised domain adaptation, existing theory focuses on situations where the source and target domains are close.
1 code implementation • 15 Jun 2020 • Jeff Z. HaoChen, Colin Wei, Jason D. Lee, Tengyu Ma
We show that in an over-parameterized setting, SGD with label noise recovers the sparse ground truth from an arbitrary initialization, whereas SGD with Gaussian noise or gradient descent overfits to dense solutions with large norms.
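The two noise models being contrasted can be made mechanically explicit; the sketch below uses a generic least-squares objective as a placeholder, not the paper's over-parameterized setting.

```python
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.standard_normal((50, 10)), rng.standard_normal(50)  # placeholder data

def grad(theta, targets):
    """Gradient of the mean squared error for a linear model."""
    return 2 * X.T @ (X @ theta - targets) / len(targets)

def label_noise_step(theta, lr=0.01, sigma=0.5):
    # "Label noise": resample fresh noise on the targets at every step, then
    # take an ordinary gradient step on the perturbed objective.
    return theta - lr * grad(theta, y + sigma * rng.standard_normal(y.shape))

def gaussian_noise_step(theta, lr=0.01, sigma=0.5):
    # "Gaussian noise": step on the clean objective and add isotropic noise
    # directly to the parameters.
    return theta - lr * grad(theta, y) + lr * sigma * rng.standard_normal(theta.shape)

theta = np.zeros(10)
for _ in range(1000):
    theta = label_noise_step(theta)
```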
no code implementations • ICLR 2020 • Colin Wei, Tengyu Ma
For linear classifiers, the relationship between (normalized) output margin and generalization is captured in a clear and simple bound – a large output margin implies good generalization.
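For reference, the normalized margin in question is min_i y_i <w, x_i> / ||w||, and classical bounds for linear classifiers scale roughly like (R / margin) / sqrt(n), where R bounds the data norm. A tiny sanity check:

```python
import numpy as np

def normalized_margin(w, X, y):
    """Smallest signed distance y_i * <w, x_i> / ||w|| over the dataset (y in {-1, +1})."""
    return np.min(y * (X @ w)) / np.linalg.norm(w)

# Toy linearly separable data: a separator aligned with the labels has margin 0.8.
X = np.array([[1.0, 0.2], [0.9, -0.1], [-1.1, 0.3], [-0.8, 0.0]])
y = np.array([1, 1, -1, -1])
print(normalized_margin(np.array([1.0, 0.0]), X, y))
```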
1 code implementation • ICML 2020 • Colin Wei, Sham Kakade, Tengyu Ma
This implicit regularization effect is analogous to the effect of stochasticity in small mini-batch stochastic gradient descent.
1 code implementation • 9 Oct 2019 • Colin Wei, Tengyu Ma
Unfortunately, for deep models, this relationship is less clear: existing analyses of the output margin give complicated bounds which sometimes depend exponentially on depth.
2 code implementations • NeurIPS 2019 • Yuanzhi Li, Colin Wei, Tengyu Ma
This concept translates to a larger-scale setting: we demonstrate that one can add a small patch to CIFAR-10 images that is immediately memorizable by a model with small initial learning rate, but ignored by the model with large learning rate until after annealing.
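A sketch of the kind of patch intervention described; the patch size, placement, and pattern here are assumptions, not the paper's exact recipe. The idea is to stamp a small class-dependent pattern onto a corner of each image, so that the class can be read off from the patch alone.

```python
import numpy as np

def add_class_patch(images, labels, patch_size=3, n_classes=10, seed=0):
    """Overwrite the top-left corner of each image with a fixed pattern indexed by its class."""
    rng = np.random.default_rng(seed)
    patches = rng.uniform(0.0, 1.0, size=(n_classes, patch_size, patch_size, images.shape[-1]))
    out = images.copy()
    out[:, :patch_size, :patch_size, :] = patches[labels]
    return out

# Toy usage on CIFAR-10-shaped arrays (N, 32, 32, 3).
images = np.zeros((8, 32, 32, 3))
labels = np.arange(8) % 10
patched = add_class_patch(images, labels)
```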
7 code implementations • NeurIPS 2019 • Kaidi Cao, Colin Wei, Adrien Gaidon, Nikos Arechiga, Tengyu Ma
Deep learning algorithms can fare poorly when the training dataset suffers from heavy class imbalance but the testing criterion requires good generalization on less frequent classes.
Ranked #4 on Long-tail learning with class descriptors on CUB-LT
1 code implementation • NeurIPS 2019 • Colin Wei, Tengyu Ma
For feedforward neural nets as well as RNNs, we obtain tighter Rademacher complexity bounds by considering additional data-dependent properties of the network: the norms of its hidden layers and the norms of the Jacobians of each layer with respect to all previous layers.
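Both data-dependent quantities are cheap to measure on a given network; below is a small PyTorch sketch with an arbitrary placeholder architecture.

```python
import torch

torch.manual_seed(0)
layers = [torch.nn.Linear(10, 16), torch.nn.ReLU(),
          torch.nn.Linear(16, 16), torch.nn.ReLU(),
          torch.nn.Linear(16, 3)]
x = torch.randn(10)

# Norms of the hidden (and output) activations for this input.
acts = [x]
for layer in layers:
    acts.append(layer(acts[-1]))
print("activation norms:", [round(a.norm().item(), 3) for a in acts[1:]])

# Jacobian of the network output with respect to the first hidden layer.
def tail(h1):
    h = h1
    for layer in layers[1:]:
        h = layer(h)
    return h

J = torch.autograd.functional.jacobian(tail, layers[0](x).detach())
print("Jacobian spectral norm:", torch.linalg.matrix_norm(J, ord=2).item())
```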
no code implementations • ICLR 2019 • Colin Wei, Jason Lee, Qiang Liu, Tengyu Ma
We establish that: 1) for multi-layer feedforward ReLU networks, the global minimizer of a weakly regularized cross-entropy loss has the maximum normalized margin among all networks; and 2) as a result, increasing the over-parametrization improves the normalized margin and generalization error bounds for deep networks.
no code implementations • NeurIPS 2019 • Colin Wei, Jason D. Lee, Qiang Liu, Tengyu Ma
We prove that for infinite-width two-layer nets, noisy gradient descent optimizes the regularized neural net loss to a global minimum in polynomial iterations.
no code implementations • 15 Oct 2016 • Colin Wei, Iain Murray
Computing partition functions, the normalizing constants of probability distributions, is often hard.
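As a tiny worked example of why this is hard, the exact partition function of even a small binary model requires summing over exponentially many configurations, while naive sampling estimates can be crude; the energy model and sizes below are toy stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 12                                   # number of binary variables (2^12 states)
J = rng.standard_normal((d, d)) * 0.1
J = (J + J.T) / 2                        # symmetric couplings for an Ising-style energy

def energy(x):
    return -x @ J @ x

# Exact partition function Z = sum_x exp(-E(x)); cost grows as 2^d.
states = ((np.arange(2 ** d)[:, None] >> np.arange(d)) & 1) * 2 - 1
Z_exact = np.exp(-np.array([energy(s) for s in states])).sum()

# Naive importance-sampling estimate with a uniform proposal:
# Z = 2^d * E_uniform[exp(-E(x))].
samples = rng.choice([-1, 1], size=(20000, d))
Z_est = (2.0 ** d) * np.exp(-np.array([energy(s) for s in samples])).mean()
print(Z_exact, Z_est)
```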