Search Results for author: Atli Kosson

Found 7 papers, 4 papers with code

Memory Efficient Mixed-Precision Optimizers

no code implementations 21 Sep 2023 Basile Lewandowski, Atli Kosson

Traditional optimization methods rely on the use of single-precision floating point arithmetic, which can be costly in terms of memory size and computing power.

Multiplication-Free Transformer Training via Piecewise Affine Operations

1 code implementation NeurIPS 2023 Atli Kosson, Martin Jaggi

Finally, we show that we can eliminate all multiplications in the entire training process, including operations in the forward pass, backward pass and optimizer update, demonstrating the first successful training of modern neural network architectures in a fully multiplication-free fashion.

Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks

2 code implementations 26 May 2023 Atli Kosson, Bettina Messmer, Martin Jaggi

This study investigates how weight decay affects the update behavior of individual neurons in deep neural networks through a combination of applied analysis and experimentation.

L2 Regularization

Ghost Noise for Regularizing Deep Neural Networks

1 code implementation 26 May 2023 Atli Kosson, Dongyang Fan, Martin Jaggi

Batch Normalization (BN) is widely used to stabilize the optimization process and improve the test performance of deep neural networks.

Adaptive Braking for Mitigating Gradient Delay

no code implementations 2 Jul 2020 Abhinav Venigalla, Atli Kosson, Vitaliy Chiley, Urs Köster

Neural network training is commonly accelerated by using multiple synchronized workers to compute gradient updates in parallel.
