no code implementations • 21 Mar 2023 • Nilin Abrahamsen, Zhiyan Ding, Gil Goldshlager, Lin Lin
We provide theoretical convergence bounds for the variational Monte Carlo (VMC) method when used to optimize neural network wave functions for the electronic structure problem.
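As a toy illustration of the VMC optimization loop analyzed here (a 1D harmonic oscillator with a Gaussian trial state, not the paper's neural-network ansatz; the step size and sample count are illustrative assumptions):

```python
import numpy as np

# Trial wavefunction psi_a(x) = exp(-a x^2), so |psi_a|^2 is a Gaussian
# with variance 1/(4a), which we can sample exactly. For the Hamiltonian
# H = -(1/2) d^2/dx^2 + (1/2) x^2, the local energy is
#   E_L(x) = a + (1/2 - 2 a^2) x^2,
# and the standard VMC gradient estimator is
#   dE/da = 2 Cov(E_L, d/da log psi_a),  with d/da log psi_a = -x^2.
rng = np.random.default_rng(0)
a = 0.9                                   # initial variational parameter
for step in range(200):
    x = rng.normal(0.0, np.sqrt(1.0 / (4 * a)), size=4096)  # ~ |psi_a|^2
    e_loc = a + (0.5 - 2 * a**2) * x**2   # local energy samples
    dlog = -x**2                          # parameter score d/da log psi_a
    grad = 2 * np.mean((e_loc - e_loc.mean()) * dlog)
    a -= 0.1 * grad                       # stochastic gradient step
# a approaches 1/2, where psi_a is the exact ground state (energy 1/2).
```

At the optimum the local energy becomes constant, so the gradient estimator's variance vanishes — the "zero-variance principle" that makes this toy loop converge cleanly.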
no code implementations • 6 Oct 2021 • Zhiyan Ding, Shi Chen, Qin Li, Stephen Wright
Finding the optimal configuration of parameters in a ResNet is a nonconvex minimization problem, but first-order methods nevertheless find the global optimum in the overparameterized regime.
no code implementations • 30 May 2021 • Zhiyan Ding, Shi Chen, Qin Li, Stephen Wright
Finding parameters in a deep neural network (NN) that fit training data is a nonconvex optimization problem, but a basic first-order method (gradient descent) finds a global optimizer with a perfect fit (zero loss) in many practical situations.
no code implementations • 8 Feb 2021 • Zhiyan Ding, Qin Li
In particular, we find that if one directly replaces the gradient with its ensemble approximation, the resulting algorithm, termed Ensemble Langevin Monte Carlo, is unstable due to a high-variance term.
no code implementations • 22 Oct 2020 • Zhiyan Ding, Qin Li, Jianfeng Lu, Stephen J. Wright
We investigate the computational complexity of RC-ULMC and compare it with the classical ULMC for strongly log-concave probability distributions.
no code implementations • 3 Oct 2020 • Zhiyan Ding, Qin Li, Jianfeng Lu, Stephen J. Wright
We investigate the total complexity of RC-LMC and compare it with the classical LMC for log-concave probability distributions.
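A minimal sketch of the two update rules being compared — classical overdamped LMC with a full gradient versus a random-coordinate step — on a simple log-concave target (the step size, target, and iteration count are illustrative assumptions, not the paper's setting):

```python
import numpy as np

def lmc_step(x, grad_f, h, rng):
    # Classical overdamped Langevin Monte Carlo: uses the full gradient.
    return x - h * grad_f(x) + np.sqrt(2 * h) * rng.standard_normal(x.shape)

def rc_lmc_step(x, partial_f, h, rng):
    # Random-coordinate LMC: pick one coordinate uniformly and update only
    # it, so each step costs one partial derivative instead of d.
    i = rng.integers(x.size)
    x = x.copy()
    x[i] += -h * partial_f(x, i) + np.sqrt(2 * h) * rng.standard_normal()
    return x

# Target density ∝ exp(-f) with f(x) = ||x||^2 / 2 (log-concave).
grad_f = lambda x: x
partial_f = lambda x, i: x[i]

rng = np.random.default_rng(0)
x = np.ones(5)
for _ in range(2000):
    x = rc_lmc_step(x, partial_f, h=0.05, rng=rng)
```

Both rules target the same stationary distribution; the complexity question studied here is how many cheap coordinate steps are needed to match the accuracy of full-gradient steps.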
no code implementations • 26 Jul 2020 • Zhiyan Ding, Qin Li
However, the method requires the evaluation of a full gradient in each iteration, and for a problem on $\mathbb{R}^d$ this amounts to $d$ partial-derivative evaluations per iteration.
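To make the per-iteration cost concrete, a small bookkeeping sketch (the counter, dimension, and step size are illustrative assumptions): a full-gradient step on $\mathbb{R}^d$ spends $d$ partial-derivative evaluations, while a single random-coordinate step spends one.

```python
import numpy as np

d = 10
calls = {"partials": 0}

def partial_f(x, i):
    # i-th partial derivative of f(x) = ||x||^2 / 2, with call counting.
    calls["partials"] += 1
    return x[i]

x = np.ones(d)

# One full-gradient step: assembles all d partial derivatives.
g = np.array([partial_f(x, i) for i in range(d)])
x = x - 0.1 * g
full_cost = calls["partials"]                 # d evaluations

# One random-coordinate step: a single partial derivative.
rng = np.random.default_rng(0)
i = rng.integers(d)
x[i] -= 0.1 * partial_f(x, i)
coord_cost = calls["partials"] - full_cost    # 1 evaluation

print(full_cost, coord_cost)                  # prints: 10 1
```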
no code implementations • NeurIPS 2020 • Zhiyan Ding, Qin Li
The high variance induced by the randomness means that a larger number of iterations is needed, which balances out the savings within each iteration.
no code implementations • 18 Oct 2019 • Zhiyan Ding, Yiding Chen, Qin Li, Xiaojin Zhu
To our knowledge, this is the first analysis of an SGD error lower bound without the strong convexity assumption.