no code implementations • 21 Sep 2023 • Basile Lewandowski, Atli Kosson
Traditional optimization methods rely on single-precision floating-point arithmetic, which can be costly in terms of memory footprint and computing power.
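As a hedged illustration of the general idea only (lower-precision optimizer state to save memory; not necessarily this paper's scheme), a minimal numpy sketch of SGD with momentum whose momentum buffer is stored in float16:

```python
import numpy as np

def sgd_momentum_step(w, grad, buf16, lr=0.01, momentum=0.9):
    """One SGD-with-momentum step where the momentum buffer is kept in
    float16 to halve optimizer-state memory. Illustrative sketch only:
    the paper's actual precision scheme may differ."""
    # Compute the update in float32 for accuracy, then store the
    # momentum buffer back in half precision.
    buf = momentum * buf16.astype(np.float32) + grad
    w = w - lr * buf
    return w, buf.astype(np.float16)

w = np.random.randn(4).astype(np.float32)
buf16 = np.zeros(4, dtype=np.float16)   # half-precision optimizer state
grad = np.random.randn(4).astype(np.float32)
w, buf16 = sgd_momentum_step(w, grad, buf16)
```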
1 code implementation • NeurIPS 2023 • Atli Kosson, Martin Jaggi
Finally, we show that we can eliminate all multiplications in the entire training process, including operations in the forward pass, backward pass and optimizer update, demonstrating the first successful training of modern neural network architectures in a fully multiplication-free fashion.
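A minimal sketch of one known multiplication-free primitive in this spirit: a piecewise affine approximation that replaces a float multiply with an integer add of the operands' bit patterns (shown here for positive floats only; the paper's full set of operations may differ):

```python
import numpy as np

def piecewise_affine_mul(a, b):
    """Approximate a * b without a hardware multiplier by adding the
    integer bit patterns of the float32 operands. Valid for positive,
    normal floats; signs would be handled separately (e.g. via XOR)."""
    ai = np.atleast_1d(np.asarray(a, dtype=np.float32)).view(np.uint32)
    bi = np.atleast_1d(np.asarray(b, dtype=np.float32)).view(np.uint32)
    # 0x3F800000 is the bit pattern of 1.0f; subtracting it removes the
    # doubled exponent bias introduced by the integer addition.
    return (ai + bi - np.uint32(0x3F800000)).view(np.float32)

print(piecewise_affine_mul(3.0, 5.0))  # ~14.0 vs. exact 15.0
```

The result is exact when either mantissa is zero and otherwise a slight underestimate, which is why this family of operations is piecewise affine rather than exact.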
2 code implementations • 26 May 2023 • Atli Kosson, Bettina Messmer, Martin Jaggi
This study investigates how weight decay affects the update behavior of individual neurons in deep neural networks through a combination of applied analysis and experimentation.
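For illustration only (a hypothetical setup, not the paper's experiments), a small numpy simulation of the quantity at issue: the relative update size of one neuron's weight vector under SGD with decoupled weight decay, which settles once the shrinkage from decay balances the growth from the gradients:

```python
import numpy as np

# Hypothetical single "neuron": one weight vector trained with SGD and
# decoupled weight decay, using random stand-in gradients. We track the
# relative update size ||delta_w|| / ||w||.
rng = np.random.default_rng(0)
w = 10.0 * rng.standard_normal(256)           # start with a large norm
lr, wd = 0.1, 1e-2

for step in range(2001):
    grad = rng.standard_normal(256)            # stand-in gradient
    w_new = (1.0 - lr * wd) * w - lr * grad    # decoupled weight decay step
    if step % 500 == 0:
        rel = np.linalg.norm(w_new - w) / np.linalg.norm(w)
        print(f"step {step:4d}  ||w|| = {np.linalg.norm(w):7.2f}  rel. update = {rel:.4f}")
    w = w_new
```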
1 code implementation • 26 May 2023 • Atli Kosson, Dongyang Fan, Martin Jaggi
Batch Normalization (BN) is widely used to stabilize the optimization process and improve the test performance of deep neural networks.
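For reference, a minimal numpy sketch of the standard BN training-mode forward pass (batch statistics plus the learnable affine parameters):

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Standard BatchNorm training-mode forward pass: normalize each
    feature with the batch mean/variance, then apply the learnable
    affine transform (gamma, beta)."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.random.randn(32, 8)             # batch of 32 samples, 8 features
y = batch_norm_forward(x, np.ones(8), np.zeros(8))
print(y.mean(axis=0).round(6), y.std(axis=0).round(3))  # ~0 mean, ~1 std
```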
no code implementations • 2 Jul 2020 • Abhinav Venigalla, Atli Kosson, Vitaliy Chiley, Urs Köster
Neural network training is commonly accelerated by using multiple synchronized workers to compute gradient updates in parallel.
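A minimal numpy sketch of this synchronous data-parallel pattern, with the all-reduce modeled as a plain average over per-worker gradients:

```python
import numpy as np

def synchronous_update(w, worker_grads, lr=0.1):
    """Synchronous data parallelism: each worker computes a gradient on
    its own data shard; the gradients are averaged (an all-reduce in a
    real distributed setting) before a single shared update."""
    g = np.mean(worker_grads, axis=0)   # all-reduce: average across workers
    return w - lr * g

w = np.zeros(4)
worker_grads = [np.random.randn(4) for _ in range(8)]  # 8 workers
w = synchronous_update(w, worker_grads)
```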
no code implementations • 25 Mar 2020 • Atli Kosson, Vitaliy Chiley, Abhinav Venigalla, Joel Hestness, Urs Köster
New hardware can substantially increase the speed and efficiency of deep neural network training.
1 code implementation • NeurIPS 2019 • Vitaliy Chiley, Ilya Sharapov, Atli Kosson, Urs Köster, Ryan Reece, Sofía Samaniego de la Fuente, Vishal Subbiah, Michael James
Online Normalization is a new technique for normalizing the hidden activations of a neural network.
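A forward-pass-only sketch, assuming exponentially decaying running estimates of mean and variance; the full method also corrects the backward pass, which this illustration omits:

```python
import numpy as np

class OnlineNormForward:
    """Forward-only sketch: normalize each incoming sample with running
    estimates of mean and variance that are updated online, so no batch
    statistics are needed (works at batch size 1). The full method also
    handles the backward pass, which this illustration omits."""
    def __init__(self, num_features, decay=0.99, eps=1e-5):
        self.mean = np.zeros(num_features)
        self.var = np.ones(num_features)
        self.decay, self.eps = decay, eps

    def __call__(self, x):
        y = (x - self.mean) / np.sqrt(self.var + self.eps)
        # Update the running statistics after normalizing the sample.
        self.mean = self.decay * self.mean + (1 - self.decay) * x
        self.var = self.decay * self.var + (1 - self.decay) * (x - self.mean) ** 2
        return y

norm = OnlineNormForward(8)
for _ in range(5):
    out = norm(np.random.randn(8))   # one sample at a time
```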