1 code implementation • ICML 2020 • Tyler B. Johnson, Pulkit Agrawal, Haijie Gu, Carlos Guestrin
When using large-batch training to speed up stochastic gradient descent, learning rates must adapt to new batch sizes in order to maximize speed-ups and preserve model quality.
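For intuition, here is a rough Python sketch of one way such an adaptation can be computed: averaging gradients over more replicas reduces their variance, so the step size can be multiplied by a gain between 1 and the scale. The helper name and the estimates below are assumptions for illustration, not the algorithm from the paper.

```python
# Illustrative sketch only (hypothetical helper, not the paper's algorithm):
# estimate how much the learning rate can grow when gradients are averaged
# over S replicas, based on gradient variance vs. squared gradient norm.
import numpy as np

def variance_based_gain(replica_grads):
    """replica_grads: list of S per-replica gradient vectors (np.ndarray)."""
    G = np.stack(replica_grads)                      # shape (S, d)
    S = G.shape[0]
    mean_grad = G.mean(axis=0)
    # Rough estimates of per-replica gradient variance and squared gradient norm.
    var = np.mean(np.sum((G - mean_grad) ** 2, axis=1))
    sq_norm = np.sum(mean_grad ** 2)
    # Gain is ~1 when gradients are noiseless and approaches S when noise dominates.
    return (var + sq_norm) / (var / S + sq_norm + 1e-12)

rng = np.random.default_rng(0)
grads = [1.0 + rng.normal(scale=2.0, size=10) for _ in range(8)]
scaled_lr = 0.1 * variance_based_gain(grads)         # base LR 0.1, scaled by the gain
print(scaled_lr)
```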
no code implementations • 25 Sep 2019 • Tyler B. Johnson, Pulkit Agrawal, Haijie Gu, Carlos Guestrin
When using distributed training to speed up stochastic gradient descent, learning rates must adapt to new scales in order to maintain training effectiveness.
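For contrast, a minimal sketch of the widely used linear-scaling-with-warmup heuristic, which adapts the learning rate to a larger batch by a fixed multiplicative rule (the function name and defaults are assumptions; this is a common baseline, not the rule proposed here).

```python
# A minimal sketch of the common "linear scaling" heuristic with warmup for
# adapting the learning rate to a larger batch (illustrative baseline only).
def scaled_learning_rate(base_lr, base_batch_size, batch_size, step, warmup_steps=1000):
    """Scale base_lr linearly with batch size, ramping up over warmup_steps."""
    target_lr = base_lr * (batch_size / base_batch_size)
    if step < warmup_steps:
        # Interpolate from base_lr to target_lr to avoid early divergence.
        return base_lr + (step / warmup_steps) * (target_lr - base_lr)
    return target_lr

# Example: a base LR of 0.1 tuned for batch size 256, now training at 2048.
print(scaled_learning_rate(0.1, 256, 2048, step=5000))  # -> 0.8
```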
no code implementations • NeurIPS 2018 • Tyler B. Johnson, Carlos Guestrin
In theory, importance sampling speeds up stochastic gradient algorithms for supervised learning by prioritizing training examples.
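As a concrete illustration of the general idea (not the robust approximate scheme developed in the paper), the sketch below runs SGD on least squares with examples sampled in proportion to their row norms and gradients reweighted to stay unbiased.

```python
# A minimal sketch of importance-sampled SGD on a least-squares objective.
# Examples are drawn with non-uniform probabilities p_i, and each gradient is
# reweighted by 1/(n * p_i) so the update remains an unbiased estimate of the
# full gradient of the average loss.
import numpy as np

def importance_sampled_sgd(X, y, lr=0.02, steps=5000, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    # Prioritize examples with larger gradient-norm bounds (here: row norms).
    p = np.linalg.norm(X, axis=1)
    p /= p.sum()
    for _ in range(steps):
        i = rng.choice(n, p=p)
        grad_i = (X[i] @ w - y[i]) * X[i]      # gradient of 0.5 * (x_i^T w - y_i)^2
        w -= lr * grad_i / (n * p[i])          # importance weight keeps it unbiased
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = X @ w_true
print(np.linalg.norm(importance_sampled_sgd(X, y) - w_true))  # should be small
```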
no code implementations • 20 Jul 2018 • Tyler B. Johnson, Carlos Guestrin
By reducing optimization to a sequence of smaller subproblems, working set algorithms achieve fast convergence times for many machine learning problems.
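The sketch below illustrates the general pattern on the Lasso: solve a subproblem restricted to a small working set, then grow the set with the worst violators of the optimality conditions. The growth rule, inner solver, and tolerances are arbitrary choices for illustration, not the algorithm analyzed in the paper.

```python
# A minimal working-set loop for the Lasso: min 0.5||Xw - y||^2 + lam*||w||_1.
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_working_set(X, y, lam, outer_iters=20, inner_iters=500):
    n, d = X.shape
    w = np.zeros(d)
    work = []                                          # indices in the working set
    step = 1.0 / np.linalg.norm(X, 2) ** 2             # 1/L for the smooth part
    for _ in range(outer_iters):
        # Violation of the optimality condition |x_j^T (Xw - y)| <= lam.
        viol = np.abs(X.T @ (X @ w - y)) - lam
        viol[work] = -np.inf                           # already handled by the subproblem
        if viol.max() <= 1e-6:
            break                                      # no remaining violations
        top = np.argsort(viol)[::-1][:5]
        work.extend(int(j) for j in top if viol[j] > 0)
        # Solve the restricted subproblem with proximal gradient (ISTA) steps.
        Xs, ws = X[:, work], w[work]
        for _ in range(inner_iters):
            ws = soft_threshold(ws - step * Xs.T @ (Xs @ ws - y), step * lam)
        w[:] = 0.0
        w[work] = ws
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))
w_true = np.zeros(50)
w_true[:3] = [2.0, -1.5, 1.0]
y = X @ w_true
print(np.nonzero(lasso_working_set(X, y, lam=0.1))[0])  # mostly the first three features
```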
no code implementations • ICML 2017 • Tyler B. Johnson, Carlos Guestrin
Coordinate descent (CD) is a scalable and simple algorithm for solving many optimization problems in machine learning.
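For reference, a minimal sketch of plain cyclic coordinate descent on the Lasso, where each step exactly minimizes the objective over one coordinate while holding the rest fixed (this is the baseline CD update, not the update-skipping variant proposed in the paper).

```python
# Cyclic coordinate descent for the Lasso: min 0.5||y - Xw||^2 + lam*||w||_1.
import numpy as np

def lasso_cd(X, y, lam, epochs=100):
    n, d = X.shape
    w = np.zeros(d)
    r = y.copy()                           # residual r = y - X w, kept in sync
    col_sq = (X ** 2).sum(axis=0)          # precomputed ||x_j||^2
    for _ in range(epochs):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            # Correlation with the partial residual that excludes coordinate j.
            rho = X[:, j] @ r + col_sq[j] * w[j]
            # Exact coordinate minimizer via soft-thresholding.
            w_new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r += X[:, j] * (w[j] - w_new)  # update the residual incrementally
            w[j] = w_new
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 20))
w_true = np.zeros(20)
w_true[[0, 3]] = [1.5, -2.0]
y = X @ w_true
print(np.round(lasso_cd(X, y, lam=0.1), 2))  # roughly recovers w_true
```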
no code implementations • NeurIPS 2016 • Tyler B. Johnson, Carlos Guestrin
We develop methods for rapidly identifying important components of a convex optimization problem for the purpose of achieving fast convergence times.
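One standard way to rapidly identify components that provably cannot matter at the optimum is duality-gap-based ("gap safe") screening; the Lasso sketch below is shown only to illustrate that general idea and is not the method developed in the paper. The problem sizes and regularization level are arbitrary.

```python
# Gap-safe screening for the Lasso: features whose score stays below 1 over the
# whole safe sphere are provably zero at the optimum and can be discarded.
import numpy as np

def gap_safe_screen(X, y, w, lam):
    """Return a boolean mask of features that can be safely discarded."""
    residual = y - X @ w
    # Rescale the residual to obtain a dual-feasible point.
    theta = residual / max(lam, np.abs(X.T @ residual).max())
    primal = 0.5 * residual @ residual + lam * np.abs(w).sum()
    dual = 0.5 * (y @ y) - 0.5 * np.sum((lam * theta - y) ** 2)
    gap = max(primal - dual, 0.0)
    radius = np.sqrt(2.0 * gap) / lam                  # radius of the safe sphere
    scores = np.abs(X.T @ theta) + radius * np.linalg.norm(X, axis=0)
    return scores < 1.0                                # True -> feature is provably zero

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 30))
w_true = np.zeros(30)
w_true[:2] = [3.0, -2.0]
y = X @ w_true
lam = 0.8 * np.abs(X.T @ y).max()
w_rough = np.zeros(30)                                 # even a crude iterate can screen
print(gap_safe_screen(X, y, w_rough, lam).sum())       # count of discardable features
```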