What Breaks The Curse of Dimensionality in Deep Learning?

NeurIPS 2021  ·  Lechao Xiao, Jeffrey Pennington

Although learning in high dimensions is commonly believed to suffer from the curse of dimensionality, modern machine learning methods often exhibit an astonishing power to tackle a wide range of challenging real-world learning problems without using abundant amounts of data. How exactly these methods break this curse remains a fundamental open question in the theory of deep learning. While previous efforts have investigated this question by studying the data (D), model (M), and inference algorithm (I) as independent modules, in this paper we analyze the triple (D, M, I) as an integrated system. We examine the basic symmetries of such systems associated with four of the main architectures in deep learning: fully-connected networks (FCN), locally-connected networks (LCN), and convolutional networks with and without pooling (GAP/VEC). By computing an eigen-decomposition of the infinite-width limits (a.k.a. Neural Kernels) of these architectures, we characterize how inductive biases (locality, weight-sharing, pooling, etc.) and the breaking of spurious symmetries can affect the performance of these learning systems. Our theoretical analysis shows that for many real-world tasks it is locality rather than symmetry that provides the first-order remedy to the curse of dimensionality. Empirical results from state-of-the-art models trained on ImageNet corroborate our theoretical findings.
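As a minimal illustration of the kernel eigen-decomposition the abstract refers to, the sketch below constructs infinite-width NTK kernels for an FCN and for convolutional networks without (VEC) and with (GAP) pooling using the neural_tangents library, then eigendecomposes each kernel matrix. The specific widths, filter sizes, and the random toy batch are illustrative assumptions, not the configurations studied in the paper.

```python
# A minimal sketch (not the paper's exact setup): infinite-width NTK kernels
# for FCN, CNN-VEC, and CNN-GAP via neural_tangents, followed by an
# eigendecomposition of each kernel matrix.
import jax.numpy as jnp
from jax import random
from neural_tangents import stax

# Fully-connected network (FCN): flatten the image, then dense layers.
_, _, fcn_kernel_fn = stax.serial(
    stax.Flatten(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

# Convolutional network without pooling (VEC): flatten the feature map.
_, _, vec_kernel_fn = stax.serial(
    stax.Conv(512, (3, 3), padding='SAME'), stax.Relu(),
    stax.Flatten(),
    stax.Dense(1),
)

# Convolutional network with a global average pooling (GAP) readout.
_, _, gap_kernel_fn = stax.serial(
    stax.Conv(512, (3, 3), padding='SAME'), stax.Relu(),
    stax.GlobalAvgPool(),
    stax.Dense(1),
)

# Toy NHWC image batch standing in for real data (assumption).
x = random.normal(random.PRNGKey(0), (64, 8, 8, 3))

for name, kernel_fn in [('FCN', fcn_kernel_fn),
                        ('CNN-VEC', vec_kernel_fn),
                        ('CNN-GAP', gap_kernel_fn)]:
    k = kernel_fn(x, None, 'ntk')        # 64 x 64 NTK matrix
    eigvals = jnp.linalg.eigvalsh(k)     # eigenvalues in ascending order
    print(name, 'top eigenvalues:', eigvals[-3:])
```

Comparing the eigenvalue spectra of such kernels is one concrete way to probe how locality, weight sharing, and pooling shape the function space induced by each architecture.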
