Search Results for author: Achraf Bahamou

Found 7 papers, 1 paper with code

Layer-wise Adaptive Step-Sizes for Stochastic First-Order Methods for Deep Learning

no code implementations 23 May 2023 Achraf Bahamou, Donald Goldfarb

We propose a new per-layer adaptive step-size procedure for stochastic first-order optimization methods for minimizing empirical loss functions in deep learning, eliminating the need for the user to tune the learning rate (LR).
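
The paper's exact step-size rule is not reproduced here; as a rough illustration of the per-layer idea, the sketch below applies a LARS-style trust-ratio step size (a stand-in heuristic, not the proposed procedure) separately to each parameter tensor of a PyTorch model.

```python
# Hypothetical sketch of a layer-wise step-size rule (NOT the paper's procedure):
# each parameter tensor gets its own step size derived from its own statistics,
# so no single global learning rate has to be hand-tuned.
import torch

def layerwise_sgd_step(model, base_scale=1.0, eps=1e-8):
    """Apply one SGD step with a separate step size per parameter tensor."""
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            # Trust-ratio style step size ||w|| / ||g|| (a LARS-like heuristic,
            # used here only as a stand-in for the paper's layer-wise rule).
            step = base_scale * p.norm() / (p.grad.norm() + eps)
            p.add_(p.grad, alpha=-step.item())
```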

A Mini-Block Fisher Method for Deep Neural Networks

no code implementations 8 Feb 2022 Achraf Bahamou, Donald Goldfarb, Yi Ren

Specifically, our method uses a block-diagonal approximation to the empirical Fisher matrix, where for each layer in the DNN, whether it is convolutional or feed-forward and fully connected, the associated diagonal block is itself block-diagonal and is composed of a large number of mini-blocks of modest size.

Second-order methods
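
As a rough sketch of the mini-block idea, the toy NumPy routine below treats each row of a fully connected layer's weight matrix as one mini-block, keeps a running empirical-Fisher estimate per block, and solves the small damped systems; the paper's actual block structure and update details differ.

```python
# Minimal, hypothetical mini-block Fisher preconditioner for one fully connected
# layer, assuming each row of the weight matrix is a mini-block (an assumption
# for illustration; the paper's blocking differs in detail).
import numpy as np

def miniblock_fisher_precondition(grad_W, fisher_blocks, beta=0.95, damping=1e-3):
    """grad_W: (out, in) gradient; fisher_blocks: list of (in, in) running Fisher blocks."""
    out_dim, in_dim = grad_W.shape
    precond = np.empty_like(grad_W)
    for i in range(out_dim):
        g = grad_W[i]                                   # mini-block gradient (one row)
        fisher_blocks[i] = beta * fisher_blocks[i] + (1 - beta) * np.outer(g, g)
        # Solve (F_i + damping * I) d = g for the preconditioned direction.
        precond[i] = np.linalg.solve(fisher_blocks[i] + damping * np.eye(in_dim), g)
    return precond
```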

Optimal Pricing with a Single Point

no code implementations 9 Mar 2021 Amine Allouah, Achraf Bahamou, Omar Besbes

For settings where the seller knows the exact probability of sale associated with one historical price or only a confidence interval for it, we fully characterize optimal performance and near-optimal pricing algorithms that adjust to the information at hand.

Computer Science and Game Theory · Information Theory
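
For intuition only (this is not the paper's optimal or robust characterization), the toy sketch below fits a one-parameter exponential demand curve to a single observation (p0, q0) of price and sale probability, then grid-searches a revenue-maximizing price under that fitted model.

```python
# Toy single-point pricing illustration: fit q(p) = exp(-lam * p) through the
# one observed point and price against it. Purely a hypothetical example, not
# the pricing algorithms characterized in the paper.
import math

def price_from_single_point(p0, q0, grid_size=1000, p_max_factor=10.0):
    lam = -math.log(q0) / p0                     # chosen so that exp(-lam * p0) = q0
    best_p, best_rev = p0, p0 * q0
    for k in range(1, grid_size + 1):
        p = p_max_factor * p0 * k / grid_size
        rev = p * math.exp(-lam * p)             # expected revenue under fitted demand
        if rev > best_rev:
            best_p, best_rev = p, rev
    return best_p

# Example: a 40% sale probability was observed at price 10.
print(price_from_single_point(10.0, 0.4))
```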

Kronecker-factored Quasi-Newton Methods for Deep Learning

no code implementations 12 Feb 2021 Yi Ren, Achraf Bahamou, Donald Goldfarb

Several improvements to the methods in Goldfarb et al. (2020) are also proposed that can be applied to both MLPs and CNNs.

Second-order methods
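
As a minimal illustration of how a Kronecker-factored curvature approximation is applied to a layer, the sketch below preconditions a layer gradient G with assumed factors A and B via B^{-1} G A^{-1}; the factor construction, quasi-Newton updates, and damping used in the paper differ.

```python
# Applying a Kronecker-factored curvature approximation (curvature ~ A (x) B)
# to a layer gradient. The factors and damping here are illustrative placeholders.
import numpy as np

def kron_precondition(G, A, B, damping=1e-2):
    """G: (out, in) layer gradient; A: (in, in) and B: (out, out) Kronecker factors."""
    A_damped = A + damping * np.eye(A.shape[0])
    B_damped = B + damping * np.eye(B.shape[0])
    # Compute B^{-1} G A^{-1} without forming any Kronecker product explicitly.
    G_A_inv = np.linalg.solve(A_damped.T, G.T).T    # right-multiply by A^{-1}
    return np.linalg.solve(B_damped, G_A_inv)       # left-multiply by B^{-1}
```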

Practical Quasi-Newton Methods for Training Deep Neural Networks

1 code implementation NeurIPS 2020 Donald Goldfarb, Yi Ren, Achraf Bahamou

We consider the development of practical stochastic quasi-Newton, and in particular Kronecker-factored block-diagonal BFGS and L-BFGS methods, for training deep neural networks (DNNs).
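
For orientation, the sketch below shows the standard BFGS inverse-Hessian update that Kronecker-factored quasi-Newton methods apply to each small factor rather than to the full Hessian; the curvature pairs (s, y), damping, and skipping rules in the paper are more involved than this generic version.

```python
# Standard BFGS inverse update H_+ = (I - rho s y^T) H (I - rho y s^T) + rho s s^T,
# shown as a generic building block; not the paper's damped K-BFGS update itself.
import numpy as np

def bfgs_inverse_update(H, s, y, eps=1e-8):
    """Update an inverse-Hessian approximation H with a curvature pair (s, y)."""
    sy = float(s @ y)
    if sy <= eps:                      # skip when curvature is not positive
        return H
    rho = 1.0 / sy
    I = np.eye(H.shape[0])
    V = I - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)
```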

Stochastic Flows and Geometric Optimization on the Orthogonal Group

no code implementations ICML 2020 Krzysztof Choromanski, David Cheikhi, Jared Davis, Valerii Likhosherstov, Achille Nazaret, Achraf Bahamou, Xingyou Song, Mrugank Akarte, Jack Parker-Holder, Jacob Bergquist, Yuan Gao, Aldo Pacchiano, Tamas Sarlos, Adrian Weller, Vikas Sindhwani

We present a new class of stochastic, geometrically-driven optimization algorithms on the orthogonal group $O(d)$ and naturally reductive homogeneous manifolds obtained from the action of the rotation group $SO(d)$.

Metric Learning · Stochastic Optimization
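
As a generic illustration of optimization constrained to the orthogonal group O(d) (not the paper's stochastic-flow algorithms), the sketch below takes one gradient step by lifting the gradient to a skew-symmetric generator and applying a Cayley retraction, which keeps the iterate exactly orthogonal.

```python
# One Cayley-retraction gradient step on O(d): a standard manifold-optimization
# move (Wen-Yin style), shown only as background for the setting.
import numpy as np

def orthogonal_step(X, euclid_grad, lr=0.1):
    """X: (d, d) orthogonal iterate; euclid_grad: Euclidean gradient of the loss at X."""
    A = euclid_grad @ X.T - X @ euclid_grad.T   # skew-symmetric generator
    I = np.eye(X.shape[0])
    # Cayley transform of a skew matrix is orthogonal, so Q @ X stays on O(d).
    Q = np.linalg.solve(I + (lr / 2) * A, I - (lr / 2) * A)
    return Q @ X
```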

A Dynamic Sampling Adaptive-SGD Method for Machine Learning

no code implementations 31 Dec 2019 Achraf Bahamou, Donald Goldfarb

We also propose an adaptive version of ADAM that eliminates the need to tune the base learning rate and compares favorably to fine-tuned ADAM on training DNNs.

BIG-bench Machine Learning · Stochastic Optimization
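
As a hypothetical stand-in for the adaptive rule (the paper's actual procedure is not reproduced here), the sketch below forms an Adam-style direction and picks the global step size by simple backtracking on the current mini-batch loss, so no base learning rate is hand-tuned; `loss_fn` is an assumed mini-batch loss callable.

```python
# Hypothetical "tuning-free" Adam-style step: Adam direction + backtracking
# line search on the mini-batch loss in place of a fixed base learning rate.
import numpy as np

def adam_direction(g, m, v, t, beta1=0.9, beta2=0.999, eps=1e-8):
    """Update Adam moment buffers m, v in place and return the search direction."""
    m[:] = beta1 * m + (1 - beta1) * g
    v[:] = beta2 * v + (1 - beta2) * g * g
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    return m_hat / (np.sqrt(v_hat) + eps)

def backtracking_step(w, d, loss_fn, alpha0=1.0, shrink=0.5, max_tries=20):
    """Shrink the step until the mini-batch loss decreases along direction d."""
    f0 = loss_fn(w)
    alpha = alpha0
    for _ in range(max_tries):
        if loss_fn(w - alpha * d) < f0:
            return w - alpha * d
        alpha *= shrink
    return w
```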
