Search Results for author: Haijie Gu

Found 2 papers, 1 paper with code

AdaScale SGD: A User-Friendly Algorithm for Distributed Training

1 code implementation • ICML 2020 • Tyler B. Johnson, Pulkit Agrawal, Haijie Gu, Carlos Guestrin

When using large-batch training to speed up stochastic gradient descent, learning rates must adapt to new batch sizes in order to maximize speed-ups and preserve model quality.

Image Classification • Machine Translation • +5
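The abstract snippet above centers on adapting the learning rate when the batch size (the training "scale") grows. Below is a minimal, illustrative Python sketch of how a scale-dependent learning-rate gain could be estimated from per-worker gradient statistics; the function name, variable names, and the specific estimator are assumptions made for this example, not the authors' released implementation.

```python
import numpy as np

def estimate_gain(per_worker_grads):
    """Return a learning-rate gain in [1, S] from S per-worker gradients.

    Illustrative sketch only: the variance/mean statistics and the formula
    below are assumptions made for this example, not the authors' code.
    """
    grads = np.stack(per_worker_grads)        # shape (S, dim)
    S = grads.shape[0]
    mean_grad = grads.mean(axis=0)            # aggregated large-batch gradient
    sigma2 = grads.var(axis=0, ddof=1).sum()  # total per-coordinate gradient variance
    mu2 = float(mean_grad @ mean_grad)        # squared norm of the mean gradient
    # The gain is close to S when gradient noise dominates and close to 1 when
    # the workers' gradients agree, so the effective step grows only as much
    # as the larger batch actually reduces the noise.
    return (sigma2 + mu2) / (sigma2 / S + mu2)

# Toy usage with synthetic gradients from 8 hypothetical workers.
rng = np.random.default_rng(0)
worker_grads = [rng.normal(loc=1.0, scale=2.0, size=10) for _ in range(8)]
gain = estimate_gain(worker_grads)
base_lr = 0.1
print(f"gain = {gain:.2f}, scaled lr = {base_lr * gain:.4f}")
```

The snippet only shows the idea of scaling a base step by a data-driven gain; preserving model quality across scales, as discussed in the ICML paper, involves more than this toy computation.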

AdaScale SGD: A Scale-Invariant Algorithm for Distributed Training

No code implementations • 25 Sep 2019 • Tyler B. Johnson, Pulkit Agrawal, Haijie Gu, Carlos Guestrin

When using distributed training to speed up stochastic gradient descent, learning rates must adapt to new scales in order to maintain training effectiveness.

Image Classification • Machine Translation • +5
