Search Results for author: Tsung-Hsien Lee

Found 1 papers, 1 papers with code

A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale

1 code implementation • 12 Sep 2023 • Hao-Jun Michael Shi, Tsung-Hsien Lee, Shintaro Iwasaki, Jose Gallego-Posada, Zhijing Li, Kaushik Rangadurai, Dheevatsa Mudigere, Michael Rabbat

It constructs a block-diagonal preconditioner where each block consists of a coarse Kronecker product approximation to full-matrix AdaGrad for each parameter of the neural network.

Stochastic Optimization

199

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.