no code implementations • 8 Apr 2023 • Alberto Bordino, Stefano Favaro, Sandra Fortini
There is a growing interest in large-width asymptotic properties of Gaussian neural networks (NNs), namely NNs whose weights are initialized according to Gaussian distributions.
no code implementations • 8 Apr 2023 • Alberto Bordino, Stefano Favaro, Sandra Fortini
Unlike previous works, our results rely on a generalized central limit theorem for heavy-tailed distributions, which allows for a unified treatment of infinitely wide limits for deep Stable NNs.
no code implementations • 16 Jun 2022 • Stefano Favaro, Sandra Fortini, Stefano Peluchetti
In contrast to the Gaussian setting, our result shows that the choice of the activation function affects the scaling of the NN: to achieve the infinitely wide $\alpha$-Stable process, the ReLU activation requires an additional logarithmic term in the scaling compared with sub-linear activations.
no code implementations • 2 Aug 2021 • Stefano Favaro, Sandra Fortini, Stefano Peluchetti
Then, we establish sup-norm convergence rates of the rescaled deep Stable NN to the Stable stochastic process (SP), under both ``joint growth'' and ``sequential growth'' of the width over the NN's layers.
no code implementations • ICLR 2021 • Daniele Bracale, Stefano Favaro, Sandra Fortini, Stefano Peluchetti
In this paper, we consider fully connected feed-forward deep neural networks where weights and biases are independent and identically distributed according to Gaussian distributions.
no code implementations • 7 Feb 2021 • Daniele Bracale, Stefano Favaro, Sandra Fortini, Stefano Peluchetti
The interplay between infinite-width neural networks (NNs) and classes of Gaussian processes (GPs) has been well known since the seminal work of Neal (1996).
1 code implementation • 1 Mar 2020 • Stefano Favaro, Sandra Fortini, Stefano Peluchetti
We consider fully connected feed-forward deep neural networks (NNs) where weights and biases are independent and identically distributed as symmetric centered stable distributions.
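The setting in this last abstract can be illustrated with a minimal sketch, which is not the authors' code. It draws one random fully connected network whose weights and biases are i.i.d. symmetric centered stable variates and evaluates it at a single input. The function names, the `tanh` activation, the unit scale parameter, and the `n_in ** (-1 / alpha)` rescaling of hidden layers are all assumptions made for illustration; the sampler uses the standard Chambers-Mallows-Stuck construction.

```python
import math
import random


def sample_symmetric_stable(alpha, rng, scale=1.0):
    """Draw one symmetric, centered alpha-stable variate (Chambers-Mallows-Stuck)."""
    if not 0.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (0, 2]")
    u = rng.uniform(-math.pi / 2, math.pi / 2)
    w = rng.expovariate(1.0)
    while w == 0.0:  # guard against a (vanishingly rare) zero draw
        w = rng.expovariate(1.0)
    if alpha == 1.0:
        return scale * math.tan(u)  # Cauchy case
    x = (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
         * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))
    return scale * x


def stable_nn_forward(x, widths, alpha, rng, activation=math.tanh):
    """Evaluate one random draw of a stable-weight network at input x.

    `widths` lists the layer widths, the last entry being the output size.
    Fresh weights and biases are sampled on every call (hypothetical helper).
    """
    h = list(x)
    for layer, n_out in enumerate(widths):
        if layer == 0:
            inputs, norm = h, 1.0  # first layer acts on the raw input
        else:
            inputs = [activation(v) for v in h]
            # Assumed normalisation: sums of n stable terms grow like n**(1/alpha).
            norm = len(h) ** (-1.0 / alpha)
        h = [
            norm * sum(sample_symmetric_stable(alpha, rng) * v for v in inputs)
            + sample_symmetric_stable(alpha, rng)  # bias term
            for _ in range(n_out)
        ]
    return h


rng = random.Random(0)
output = stable_nn_forward([1.0, -0.5], widths=[50, 50, 1], alpha=1.5, rng=rng)
```

With `alpha=2.0` the sampler returns Gaussian draws (variance 2 at unit scale), so the same sketch also covers the Gaussian setting of the earlier entries; very small values of `alpha` can overflow in floating point and are outside what this sketch handles.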