1 code implementation • ICML 2020 • Tyler B. Johnson, Pulkit Agrawal, Haijie Gu, Carlos Guestrin
When using large-batch training to speed up stochastic gradient descent, learning rates must adapt to new batch sizes in order to maximize speed-ups and preserve model quality.
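For intuition, here is a rough Python sketch of one way such an adaptation can be computed: averaging gradients over more replicas reduces their variance, so the step size can be multiplied by a gain between 1 and the scale. The helper name and the estimates below are assumptions for illustration, not the algorithm from the paper.

```python
# Illustrative sketch only (hypothetical helper, not the paper's algorithm):
# estimate how much the learning rate can grow when gradients are averaged
# over S replicas, based on gradient variance vs. squared gradient norm.
import numpy as np

def variance_based_gain(replica_grads):
    """replica_grads: list of S per-replica gradient vectors (np.ndarray)."""
    G = np.stack(replica_grads)                      # shape (S, d)
    S = G.shape[0]
    mean_grad = G.mean(axis=0)
    # Rough estimates of per-replica gradient variance and squared gradient norm.
    var = np.mean(np.sum((G - mean_grad) ** 2, axis=1))
    sq_norm = np.sum(mean_grad ** 2)
    # Gain is ~1 when gradients are noiseless and approaches S when noise dominates.
    return (var + sq_norm) / (var / S + sq_norm + 1e-12)

rng = np.random.default_rng(0)
grads = [1.0 + rng.normal(scale=2.0, size=10) for _ in range(8)]
scaled_lr = 0.1 * variance_based_gain(grads)         # base LR 0.1, scaled by the gain
print(scaled_lr)
```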
no code implementations • 25 Sep 2019 • Tyler B. Johnson, Pulkit Agrawal, Haijie Gu, Carlos Guestrin
When using distributed training to speed up stochastic gradient descent, learning rates must adapt to new scales in order to maintain training effectiveness.
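For contrast, a minimal sketch of the widely used linear-scaling-with-warmup heuristic, which adapts the learning rate to a larger batch by a fixed multiplicative rule (the function name and defaults are assumptions; this is a common baseline, not the rule proposed here).

```python
# A minimal sketch of the common "linear scaling" heuristic with warmup for
# adapting the learning rate to a larger batch (illustrative baseline only).
def scaled_learning_rate(base_lr, base_batch_size, batch_size, step, warmup_steps=1000):
    """Scale base_lr linearly with batch size, ramping up over warmup_steps."""
    target_lr = base_lr * (batch_size / base_batch_size)
    if step < warmup_steps:
        # Interpolate from base_lr to target_lr to avoid early divergence.
        return base_lr + (step / warmup_steps) * (target_lr - base_lr)
    return target_lr

# Example: a base LR of 0.1 tuned for batch size 256, now training at 2048.
print(scaled_learning_rate(0.1, 256, 2048, step=5000))  # -> 0.8
```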
no code implementations • NeurIPS 2018 • Tyler B. Johnson, Carlos Guestrin
In theory, importance sampling speeds up stochastic gradient algorithms for supervised learning by prioritizing training examples.
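As a concrete illustration of the general idea (not the robust approximate scheme developed in the paper), the sketch below runs SGD on least squares with examples sampled in proportion to their row norms and gradients reweighted to stay unbiased.

```python
# A minimal sketch of importance-sampled SGD on a least-squares objective.
# Examples are drawn with non-uniform probabilities p_i, and each gradient is
# reweighted by 1/(n * p_i) so the update remains an unbiased estimate of the
# full gradient of the average loss.
import numpy as np

def importance_sampled_sgd(X, y, lr=0.02, steps=5000, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    # Prioritize examples with larger gradient-norm bounds (here: row norms).
    p = np.linalg.norm(X, axis=1)
    p /= p.sum()
    for _ in range(steps):
        i = rng.choice(n, p=p)
        grad_i = (X[i] @ w - y[i]) * X[i]      # gradient of 0.5 * (x_i^T w - y_i)^2
        w -= lr * grad_i / (n * p[i])          # importance weight keeps it unbiased
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = X @ w_true
print(np.linalg.norm(importance_sampled_sgd(X, y) - w_true))  # should be small
```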
no code implementations • 20 Jul 2018 • Tyler B. Johnson, Carlos Guestrin
By reducing optimization to a sequence of smaller subproblems, working set algorithms achieve fast convergence times for many machine learning problems.
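The sketch below illustrates the general pattern on the Lasso: solve a subproblem restricted to a small working set, then grow the set with the worst violators of the optimality conditions. The growth rule, inner solver, and tolerances are arbitrary choices for illustration, not the algorithm analyzed in the paper.

```python
# A minimal working-set loop for the Lasso: min 0.5||Xw - y||^2 + lam*||w||_1.
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_working_set(X, y, lam, outer_iters=20, inner_iters=500):
    n, d = X.shape
    w = np.zeros(d)
    work = []                                          # indices in the working set
    step = 1.0 / np.linalg.norm(X, 2) ** 2             # 1/L for the smooth part
    for _ in range(outer_iters):
        # Violation of the optimality condition |x_j^T (Xw - y)| <= lam.
        viol = np.abs(X.T @ (X @ w - y)) - lam
        viol[work] = -np.inf                           # already handled by the subproblem
        if viol.max() <= 1e-6:
            break                                      # no remaining violations
        top = np.argsort(viol)[::-1][:5]
        work.extend(int(j) for j in top if viol[j] > 0)
        # Solve the restricted subproblem with proximal gradient (ISTA) steps.
        Xs, ws = X[:, work], w[work]
        for _ in range(inner_iters):
            ws = soft_threshold(ws - step * Xs.T @ (Xs @ ws - y), step * lam)
        w[:] = 0.0
        w[work] = ws
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))
w_true = np.zeros(50)
w_true[:3] = [2.0, -1.5, 1.0]
y = X @ w_true
print(np.nonzero(lasso_working_set(X, y, lam=0.1))[0])  # mostly the first three features
```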
no code implementations • ICML 2017 • Tyler B. Johnson, Carlos Guestrin
Coordinate descent (CD) is a scalable and simple algorithm for solving many optimization problems in machine learning.
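For reference, a minimal sketch of plain cyclic coordinate descent on the Lasso, where each step exactly minimizes the objective over one coordinate while holding the rest fixed (this is the baseline CD update, not the update-skipping variant proposed in the paper).

```python
# Cyclic coordinate descent for the Lasso: min 0.5||y - Xw||^2 + lam*||w||_1.
import numpy as np

def lasso_cd(X, y, lam, epochs=100):
    n, d = X.shape
    w = np.zeros(d)
    r = y.copy()                           # residual r = y - X w, kept in sync
    col_sq = (X ** 2).sum(axis=0)          # precomputed ||x_j||^2
    for _ in range(epochs):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            # Correlation with the partial residual that excludes coordinate j.
            rho = X[:, j] @ r + col_sq[j] * w[j]
            # Exact coordinate minimizer via soft-thresholding.
            w_new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r += X[:, j] * (w[j] - w_new)  # update the residual incrementally
            w[j] = w_new
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 20))
w_true = np.zeros(20)
w_true[[0, 3]] = [1.5, -2.0]
y = X @ w_true
print(np.round(lasso_cd(X, y, lam=0.1), 2))  # roughly recovers w_true
```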
no code implementations • NeurIPS 2016 • Tyler B. Johnson, Carlos Guestrin
We develop methods for rapidly identifying important components of a convex optimization problem for the purpose of achieving fast convergence times.
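One standard way to rapidly identify components that provably cannot matter at the optimum is duality-gap-based ("gap safe") screening; the Lasso sketch below is shown only to illustrate that general idea and is not the method developed in the paper. The problem sizes and regularization level are arbitrary.

```python
# Gap-safe screening for the Lasso: features whose score stays below 1 over the
# whole safe sphere are provably zero at the optimum and can be discarded.
import numpy as np

def gap_safe_screen(X, y, w, lam):
    """Return a boolean mask of features that can be safely discarded."""
    residual = y - X @ w
    # Rescale the residual to obtain a dual-feasible point.
    theta = residual / max(lam, np.abs(X.T @ residual).max())
    primal = 0.5 * residual @ residual + lam * np.abs(w).sum()
    dual = 0.5 * (y @ y) - 0.5 * np.sum((lam * theta - y) ** 2)
    gap = max(primal - dual, 0.0)
    radius = np.sqrt(2.0 * gap) / lam                  # radius of the safe sphere
    scores = np.abs(X.T @ theta) + radius * np.linalg.norm(X, axis=0)
    return scores < 1.0                                # True -> feature is provably zero

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 30))
w_true = np.zeros(30)
w_true[:2] = [3.0, -2.0]
y = X @ w_true
lam = 0.8 * np.abs(X.T @ y).max()
w_rough = np.zeros(30)                                 # even a crude iterate can screen
print(gap_safe_screen(X, y, w_rough, lam).sum())       # count of discardable features
```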