Natural Gradient Descent

Natural Gradient Descent is an approximate second-order optimisation method. It can be interpreted as steepest descent on a Riemannian manifold under an intrinsic distance metric, which makes the updates invariant to reparameterisations such as whitening. By using the positive semi-definite (PSD) Gauss-Newton matrix to approximate the (possibly indefinite) Hessian, NGD can often work better than exact second-order methods.

Given the gradient of $f$ with respect to $z$, $g = \frac{\partial{f}\left(z\right)}{\partial{z}}$, NGD computes the update as:

$$\Delta{z} = \alpha{F}^{-1}g$$

where the Fisher information matrix $F$ is defined as:

$$ F = \mathbb{E}_{p\left(t\mid{z}\right)}\left[\nabla_z\ln{p}\left(t\mid{z}\right)\nabla_z\ln{p}\left(t\mid{z}\right)^{\top}\right] $$
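This definition makes the PSD property claimed above explicit: for any vector $v$,

$$ v^{\top}Fv = \mathbb{E}_{p\left(t\mid{z}\right)}\left[\left(\nabla_z\ln{p}\left(t\mid{z}\right)^{\top}v\right)^{2}\right] \ge 0 $$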

The log-likelihood $\ln{p}\left(t\mid{z}\right)$ typically corresponds, up to sign, to commonly used error functions such as the cross-entropy loss.
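As a concrete illustration, below is a minimal sketch of a single NGD step for a softmax model $p\left(t\mid{z}\right)$ parameterised directly by its logits $z$, with $F$ estimated by Monte Carlo from the expectation above. The function names, sample count, and the damping term (added because the softmax Fisher $\operatorname{diag}\left(p\right) - pp^{\top}$ is singular) are illustrative assumptions, not part of any reference implementation.

```python
import numpy as np

def score(z, t):
    """Gradient of ln p(t|z) w.r.t. the logits z of a softmax model."""
    p = np.exp(z - z.max())
    p /= p.sum()
    one_hot = np.eye(len(z))[t]
    return one_hot - p  # d/dz ln softmax(z)[t]

def fisher_mc(z, n_samples=10_000, rng=None):
    """Monte Carlo estimate of F = E_{p(t|z)}[score score^T]."""
    rng = np.random.default_rng(0) if rng is None else rng
    p = np.exp(z - z.max())
    p /= p.sum()
    F = np.zeros((len(z), len(z)))
    for t in rng.choice(len(z), size=n_samples, p=p):
        s = score(z, t)
        F += np.outer(s, s)
    return F / n_samples

def ngd_step(z, g, alpha=0.1, damping=1e-3):
    """Compute Delta z = alpha * F^{-1} g, damped so F is invertible."""
    F = fisher_mc(z)
    return alpha * np.linalg.solve(F + damping * np.eye(len(z)), g)

# One ascent step on ln p(t=0 | z) for a 3-class softmax model.
z = np.array([0.5, -0.2, 0.1])
g = score(z, 0)          # gradient of the log-likelihood at t = 0
z = z + ngd_step(z, g)   # natural-gradient update
```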

Source: LOGAN

Image: Fast Convergence of Natural Gradient Descent for Overparameterized Neural Networks

Latest Papers

PAPER / AUTHORS / DATE
When Does Preconditioning Help or Hurt Generalization?
Shun-ichi Amari, Jimmy Ba, Roger Grosse, Xuechen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, Ji Xu
2020-06-18
Mirrorless Mirror Descent: A More Natural Discretization of Riemannian Gradient Flow
Suriya Gunasekar, Blake Woodworth, Nathan Srebro
2020-04-02
Towards Query-Efficient Black-Box Adversary with Zeroth-Order Natural Gradient Descent
Pu Zhao, Pin-Yu Chen, Siyue Wang, Xue Lin
2020-02-18
Scalable and Practical Natural Gradient for Large-Scale Deep Learning
Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse, Chuan-Sheng Foo, Rio Yokota
2020-02-13
Multivariate Gaussian Variational Inference by Natural Gradient Descent
Timothy D. Barfoot
2020-01-27
Dual Stochastic Natural Gradient Descent
Borja Sánchez-López, Jesús Cerquides
2020-01-19
Exact Analysis of Curvature Corrected Learning Dynamics in Deep Linear Networks
Anonymous
2020-01-01
Implicit regularization and momentum algorithms in nonlinear adaptive control and prediction
Nicholas M. Boffi, Jean-Jacques E. Slotine
2019-12-31
LOGAN: Latent Optimisation for Generative Adversarial Networks
Yan Wu, Jeff Donahue, David Balduzzi, Karen Simonyan, Timothy Lillicrap
2019-12-02
Fast Convergence of Natural Gradient Descent for Over-Parameterized Neural Networks
Guodong Zhang, James Martens, Roger B. Grosse
2019-12-01
Quantum Natural Gradient
James Stokes, Josh Izaac, Nathan Killoran, Giuseppe Carleo
2019-09-04
Hopfield Neural Network Flow: A Geometric Viewpoint
Abhishek Halder, Kenneth F. Caluya, Bertrand Travacca, Scott J. Moura
2019-08-04

Tasks

TASK | PAPERS | SHARE
Image Classification | 2 | 40.00%
Adversarial Attack | 1 | 20.00%
Conditional Image Generation | 1 | 20.00%
Image Generation | 1 | 20.00%

Components

No components found.
