Search Results for author: Hao-Jun Michael Shi

Found 7 papers, 4 papers with code

A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale

1 code implementation • 12 Sep 2023 • Hao-Jun Michael Shi, Tsung-Hsien Lee, Shintaro Iwasaki, Jose Gallego-Posada, Zhijing Li, Kaushik Rangadurai, Dheevatsa Mudigere, Michael Rabbat

The optimizer constructs a block-diagonal preconditioner in which each block is a coarse Kronecker-product approximation to full-matrix AdaGrad for a single parameter of the neural network.

Stochastic Optimization
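The paper's implementation is a full distributed PyTorch optimizer; as a rough illustration of the underlying idea only, the sketch below applies the single-matrix, non-distributed Kronecker-factored preconditioner in NumPy. The function name and the eigendecomposition-based inverse root are illustrative choices, not the paper's implementation.

```python
import numpy as np

def shampoo_step(W, G, L, R, lr=0.01, eps=1e-6):
    """One simplified Shampoo-style update for a matrix parameter W.

    Full-matrix AdaGrad would precondition with an (mn x mn) matrix;
    the Kronecker-factored approximation replaces it with two small
    factors L (m x m) and R (n x n) accumulated from the gradients G.
    """
    L += G @ G.T            # left Kronecker factor statistic
    R += G.T @ G            # right Kronecker factor statistic

    def inv_root(M, p=4):
        # Matrix inverse p-th root via eigendecomposition, damped by eps.
        vals, vecs = np.linalg.eigh(M + eps * np.eye(M.shape[0]))
        return vecs @ np.diag(np.maximum(vals, eps) ** (-1.0 / p)) @ vecs.T

    # Preconditioned update: W <- W - lr * L^{-1/4} G R^{-1/4}
    W -= lr * inv_root(L) @ G @ inv_root(R)
    return W, L, R
```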

A Noise-Tolerant Quasi-Newton Algorithm for Unconstrained Optimization

1 code implementation • 9 Oct 2020 • Hao-Jun Michael Shi, Yuchen Xie, Richard Byrd, Jorge Nocedal

This paper describes an extension of the BFGS and L-BFGS methods for the minimization of a nonlinear function subject to errors.

Optimization and Control
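As a hedged sketch of how noise tolerance can enter quasi-Newton updating, the snippet below simply skips BFGS updates whose curvature could be explained by gradient noise of size `eps_g`. This is a simplification for illustration; the paper's actual mechanism lengthens the differencing interval used to form the curvature pair.

```python
import numpy as np

def noisy_bfgs_update(H, s, y, eps_g):
    """Apply a BFGS inverse-Hessian update only when the curvature pair
    (s, y) is trustworthy despite gradient noise of level eps_g.

    Illustrative simplification, not the paper's lengthening rule.
    """
    sy = s @ y
    # Each noisy gradient can perturb y by up to eps_g in norm, so
    # |s^T y| <= 2*eps_g*||s|| is attributable to noise alone: skip.
    if sy <= 2.0 * eps_g * np.linalg.norm(s):
        return H
    # Standard BFGS inverse update:
    # H+ = (I - rho s y^T) H (I - rho y s^T) + rho s s^T
    rho = 1.0 / sy
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)
```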

Compositional Embeddings Using Complementary Partitions for Memory-Efficient Recommendation Systems

6 code implementations • 4 Sep 2019 • Hao-Jun Michael Shi, Dheevatsa Mudigere, Maxim Naumov, Jiyan Yang

We propose a novel approach for reducing the embedding size in an end-to-end fashion by exploiting complementary partitions of the category set to produce a unique embedding vector for each category without explicit definition.

Recommendation Systems
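The paper's canonical instance of complementary partitions is the quotient-remainder trick. Below is a minimal PyTorch sketch; the class name is illustrative, and the element-wise-product combiner is one of the composition operations the paper discusses.

```python
import torch
import torch.nn as nn

class QREmbedding(nn.Module):
    """Quotient-remainder compositional embedding: two small tables of
    roughly sqrt(n) rows replace one table of n rows, and each category
    still gets a unique vector because (i // m, i % m) determines i.
    """
    def __init__(self, num_categories, dim, m):
        super().__init__()
        q = (num_categories + m - 1) // m      # rows in the quotient table
        self.quotient = nn.Embedding(q, dim)
        self.remainder = nn.Embedding(m, dim)
        self.m = m

    def forward(self, idx):
        # Element-wise product combines the two partial embeddings.
        return self.quotient(idx // self.m) * self.remainder(idx % self.m)

# e.g. 1M categories stored as two tables of ~1000 rows each:
emb = QREmbedding(num_categories=1_000_000, dim=16, m=1000)
vecs = emb(torch.tensor([0, 999, 123456]))
```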

A Progressive Batching L-BFGS Method for Machine Learning

no code implementations • ICML 2018 • Raghu Bollapragada, Dheevatsa Mudigere, Jorge Nocedal, Hao-Jun Michael Shi, Ping Tak Peter Tang

The standard L-BFGS method relies on gradient approximations that are not dominated by noise, so that search directions are descent directions, the line search is reliable, and quasi-Newton updating yields useful quadratic models of the objective function.

BIG-bench Machine Learning
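To make the progressive-batching idea concrete, here is an illustrative "norm test" for deciding when to grow the batch; the paper itself develops related inner-product and orthogonality tests, and `theta` below is an assumed tolerance.

```python
import numpy as np

def next_batch_size(sample_grads, batch_size, theta=0.9):
    """Grow the batch when the sampled gradient is dominated by noise.

    sample_grads: array of shape (batch_size, dim) of per-sample gradients.
    """
    g = sample_grads.mean(axis=0)                        # batch gradient
    var = ((sample_grads - g) ** 2).sum(axis=1).mean()   # trace of sample covariance
    # Norm test: the gradient estimate is acceptable when its variance,
    # scaled by the batch size, is small relative to ||g||^2.
    if var / batch_size > theta**2 * (g @ g):
        # Variance too large: return a batch size that would pass the test.
        return int(np.ceil(var / (theta**2 * (g @ g))))
    return batch_size
```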

A Primer on Coordinate Descent Algorithms

no code implementations • 30 Sep 2016 • Hao-Jun Michael Shi, Shenyinying Tu, Yangyang Xu, Wotao Yin

This monograph presents the class of coordinate descent algorithms to mathematicians, statisticians, and engineers outside the field of optimization.

BIG-bench Machine Learning • Distributed Computing
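A minimal example of the algorithm class the monograph surveys: cyclic coordinate descent with exact one-dimensional minimization on a least-squares objective.

```python
import numpy as np

def coordinate_descent_ls(A, b, n_iters=100):
    """Cyclic coordinate descent for min_x 0.5 * ||A x - b||^2.

    Each pass minimizes the objective exactly over one coordinate at a
    time while holding all the others fixed.
    """
    m, n = A.shape
    x = np.zeros(n)
    r = b - A @ x                      # maintain the residual b - A x
    col_sq = (A ** 2).sum(axis=0)      # ||a_i||^2 for each column
    for _ in range(n_iters):
        for i in range(n):
            # Exact minimizer over coordinate i: delta = a_i^T r / ||a_i||^2
            delta = (A[:, i] @ r) / col_sq[i]
            x[i] += delta
            r -= delta * A[:, i]       # cheap residual update
    return x
```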

Practical Algorithms for Learning Near-Isometric Linear Embeddings

no code implementations • 1 Jan 2016 • Jerry Luo, Kayla Shapiro, Hao-Jun Michael Shi, Qi Yang, Kan Zhu

Motivated by non-negative matrix factorization, we reformulate our problem as a Frobenius-norm minimization problem and develop an algorithm, FroMax, that solves it with the Alternating Direction Method of Multipliers (ADMM).

Dimensionality Reduction
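For context on the solver family, here is a generic scaled-form ADMM scaffold for a consensus problem min f(X) + g(Z) subject to X = Z. This sketches the method class only and is not the FroMax algorithm; `prox_f` and `prox_g` are assumed proximal operators supplied by the caller.

```python
import numpy as np

def admm(prox_f, prox_g, shape, n_iters=100):
    """Generic scaled-form ADMM for min_X f(X) + g(Z) s.t. X = Z.

    prox_f, prox_g: proximal operators of the two objective terms
    (e.g. f a Frobenius-norm data-fit term, g a constraint indicator).
    """
    X = np.zeros(shape)
    Z = np.zeros(shape)
    U = np.zeros(shape)             # scaled dual variable
    for _ in range(n_iters):
        X = prox_f(Z - U)           # X-minimization step
        Z = prox_g(X + U)           # Z-minimization step
        U += X - Z                  # dual ascent on the consensus constraint
    return X
```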
