CDF Normalization for Controlling the Distribution of Hidden Nodes

NeurIPS Workshop ICBINB 2021 · Mike Van Ness, Madeleine Udell ·

Batch Normalizaiton (BN) is a normalization method for deep neural networks that has been shown to accelerate training. While the effectiveness of BN is undisputed, the explanation of its effectiveness is still being studied. The original BN paper attributes the success of BN to reducing internal covariate shift, so we take this a step further and explicitly enforce a Gaussian distribution on hidden layer activations. This approach proves to be ineffective, demonstrating that reducing internal covariate shift in and of itself does not help training.

PDF Abstract