no code implementations • 25 May 2021 • George Philipp
We argue that the NLC (nonlinearity coefficient) is the most powerful scalar statistic for architecture design specifically, and for neural network analysis in general.
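A minimal sketch of how such a statistic can be estimated, assuming the definition NLC(f, D) = sqrt( E_x Tr(J(x) S_x J(x)^T) / Tr(S_f) ), where J(x) is the network Jacobian at input x and S_x, S_f are the input and output covariances; the function `estimate_nlc` and the batch-based Monte Carlo setup are ours, not the authors' reference code:

```python
import torch
from torch.autograd.functional import jacobian

def estimate_nlc(net, X):
    """Rough batch estimate of the nonlinearity coefficient (NLC)."""
    # Batch estimates of the input covariance S_x and of Tr(S_f),
    # the trace of the output covariance.
    with torch.no_grad():
        F = net(X)
        Xc = X - X.mean(0, keepdim=True)
        Fc = F - F.mean(0, keepdim=True)
        n = X.shape[0]
        Sx = Xc.T @ Xc / (n - 1)
        tr_Sf = (Fc ** 2).sum() / (n - 1)

    # E_x Tr(J(x) S_x J(x)^T), using the exact per-example Jacobian;
    # fine for small nets, too slow for large ones.
    total = 0.0
    for x in X:
        J = jacobian(net, x)          # shape (d_out, d_in) for 1-D x
        total = total + torch.trace(J @ Sx @ J.T)
    return torch.sqrt(total / n / tr_Sf)

net = torch.nn.Sequential(torch.nn.Linear(10, 64), torch.nn.Tanh(),
                          torch.nn.Linear(64, 5))
print(estimate_nlc(net, torch.randn(256, 10)))  # ~1: near-linear; >>1: highly nonlinear
```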
no code implementations • ICLR 2019 • George Philipp, Jaime G. Carbonell
Through an extensive empirical study, we show that the NLC is a powerful predictor of test error and that attaining an appropriately sized NLC is essential for optimal performance.
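One way such a criterion could be used in practice, sketched under the assumption that the NLC is measured at initialization; the function name, the candidate-constructor interface, and the [1, 5] target range are illustrative choices of ours (reusing `estimate_nlc` from the sketch above), not a procedure taken from the paper:

```python
def screen_architectures(candidates, X, lo=1.0, hi=5.0):
    # `candidates` are zero-argument constructors returning freshly
    # initialized networks; keep those whose initial NLC lands in
    # the (illustrative) range [lo, hi].
    keep = []
    for make_net in candidates:
        net = make_net()
        nlc = estimate_nlc(net, X).item()
        if lo <= nlc <= hi:
            keep.append((nlc, net))
    return sorted(keep, key=lambda t: t[0])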
no code implementations • ICLR 2018 • George Philipp, Dawn Song, Jaime G. Carbonell
Although techniques such as Adam, batch normalization, and, more recently, SELU nonlinearities are widely believed to "solve" the exploding gradient problem, we show that this is not the case in general: in a range of popular MLP architectures, exploding gradients exist and limit the depth to which networks can be effectively trained, both in theory and in practice.
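A quick way to check this empirically is to record gradient norms along the depth of an MLP at initialization; a minimal sketch (the depth, width, and SELU/plain-MLP choices are illustrative, not the paper's exact experimental setup):

```python
import torch
import torch.nn as nn

depth, width = 50, 128
layers = []
for _ in range(depth):
    layers += [nn.Linear(width, width), nn.SELU()]
net = nn.Sequential(*layers)

x = torch.randn(32, width, requires_grad=True)
acts, h = [x], x
for layer in net:
    h = layer(h)
    h.retain_grad()            # keep gradients at intermediate activations
    acts.append(h)
h.sum().backward()             # arbitrary scalar "loss"

# If gradients explode, the norm grows as we move from the output
# (last entries) back toward the input (first entries).
for i in range(0, len(acts), 20):
    print(f"activation {i:3d}: grad norm = {acts[i].grad.norm():.3e}")
```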
no code implementations • 14 Dec 2017 • George Philipp, Jaime G. Carbonell
Automatically determining the optimal size of a neural network for a given task without prior information currently requires an expensive global search and training many networks from scratch.
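To make the cost of that baseline concrete, here is a minimal sketch of the naive global search the sentence describes: one full training run from scratch per candidate size (the task, the candidate widths, and the training budget are all illustrative):

```python
import torch
import torch.nn as nn

def train_and_eval(width, X, y, Xval, yval, epochs=200):
    # Train a one-hidden-layer MLP of the given width from scratch
    # and report its validation loss.
    net = nn.Sequential(nn.Linear(X.shape[1], width), nn.ReLU(),
                        nn.Linear(width, 1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(net(X), y).backward()
        opt.step()
    with torch.no_grad():
        return loss_fn(net(Xval), yval).item()

# Synthetic regression task.
torch.manual_seed(0)
X, Xval = torch.randn(512, 10), torch.randn(128, 10)
w_true = torch.randn(10, 1)
y, yval = (X @ w_true).tanh(), (Xval @ w_true).tanh()

# The expensive global search: every candidate size is trained in full.
best_loss, best_width = min((train_and_eval(w, X, y, Xval, yval), w)
                            for w in [8, 16, 32, 64, 128, 256])
print(best_width, best_loss)
```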
no code implementations • 13 Dec 2017 • George Philipp, Seunghak Lee, Eric P. Xing
Recently, a meta-algorithm called Stability Selection was proposed that can provide reliable finite-sample control of the number of false positives.
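For context, a minimal sketch of that base meta-algorithm (stability selection, Meinshausen & Bühlmann, 2010) using the lasso as the inner selector; this illustrates the general idea, not this paper's specific procedure, and the parameter settings are illustrative:

```python
import numpy as np
from sklearn.linear_model import Lasso

def stability_selection(X, y, alpha=0.05, n_subsamples=100,
                        threshold=0.6, seed=0):
    # Fit the lasso on many random half-subsamples of the data and
    # keep the features whose selection frequency clears `threshold`.
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=n // 2, replace=False)
        coef = Lasso(alpha=alpha).fit(X[idx], y[idx]).coef_
        counts += coef != 0
    freq = counts / n_subsamples
    return np.flatnonzero(freq >= threshold), freq

# Sparse synthetic problem: only the first 3 of 50 features matter.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 50))
y = X[:, :3] @ np.array([2.0, -1.5, 1.0]) + 0.1 * rng.standard_normal(200)
selected, freq = stability_selection(X, y)
print(selected)   # typically recovers features 0, 1, 2
```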