6 Apr 2023 • Luís Carvalho, João Lopes Costa, José Mourão, Gonçalo Oliveira
In particular, using a certain metric on the space of measures, we find that, throughout training, the resulting measure is within $n^{-\frac{1}{2}}(\log n)^{1+}$ of the time-dependent Gaussian process corresponding to the infinite-width network (which is given explicitly by precomposing the initial Gaussian process with the linear flow corresponding to training in the infinite-width limit).
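
Written as a display, the rate above can be sketched as follows. This is only a schematic reading, not the paper's precise statement: it assumes that $n$ denotes the network width and that the exponent $1+$ stands for $1+\varepsilon$ with $\varepsilon > 0$ arbitrarily small. The symbols $d$, $\mu_t^{(n)}$, $\mu_t^{(\infty)}$ and $C_\varepsilon$ are introduced here purely for illustration and do not appear in the text.

```latex
% Schematic form of the bound; every symbol below is an illustrative assumption.
% d                 : the metric on the space of measures (not specified in the text)
% \mu_t^{(n)}       : law of the width-n network at training time t
% \mu_t^{(\infty)}  : law of the time-dependent Gaussian process (infinite-width limit)
% C_\varepsilon     : a constant independent of n (its dependence on t is not stated in the text)
\[
  d\bigl(\mu_t^{(n)},\, \mu_t^{(\infty)}\bigr)
  \;\le\; C_\varepsilon \, n^{-\frac{1}{2}} (\log n)^{1+\varepsilon},
  \qquad \varepsilon > 0 .
\]
```

Under this reading, the distance decays at the rate $n^{-\frac{1}{2}}$ up to a logarithmic factor.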