1 code implementation • ICML 2020 • Tyler B. Johnson, Pulkit Agrawal, Haijie Gu, Carlos Guestrin
When using large-batch training to speed up stochastic gradient descent, learning rates must adapt to new batch sizes in order to maximize speed-ups and preserve model quality.
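To make the problem concrete, a minimal sketch of the widely used linear scaling heuristic (Goyal et al., 2017) is shown below. This is an illustration of why learning rates must track batch size, not the adaptation rule proposed in this paper; all names (`scaled_learning_rate`, `base_lr`, `base_batch_size`) are illustrative assumptions.

```python
def scaled_learning_rate(base_lr: float, base_batch_size: int, batch_size: int) -> float:
    """Linear scaling heuristic: grow the learning rate in proportion
    to the batch size, relative to the batch size it was tuned at.
    This is a common fixed rule, not the paper's adaptive method."""
    return base_lr * (batch_size / base_batch_size)

# Example: a recipe tuned at batch size 256 with lr 0.1, scaled up to batch size 2048.
if __name__ == "__main__":
    lr = scaled_learning_rate(base_lr=0.1, base_batch_size=256, batch_size=2048)
    print(f"scaled learning rate: {lr}")  # 0.8
```

Fixed rules like this require re-tuning (and typically warmup) at each new scale; adapting the learning rate to the batch size automatically, as the paper studies, avoids that per-scale tuning.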
no code implementations • 25 Sep 2019 • Tyler B. Johnson, Pulkit Agrawal, Haijie Gu, Carlos Guestrin
When using distributed training to speed up stochastic gradient descent, learning rates must adapt to new scales (larger effective batch sizes) in order to maintain training effectiveness.