Factor Normalization for Deep Neural Network Models
Deep neural network (DNN) models often involve ultrahigh-dimensional features. In most cases, these features can be decomposed into two parts: a low-dimensional factor component and a residual component with much-reduced variability and inter-feature correlation. This decomposition leads to a number of interesting theoretical findings about deep neural network training and inspires a new factor normalization method for fast DNN training. The proposed method consists of three steps. First, it decomposes the ultrahigh-dimensional features into a factor part and a residual part. Second, it slightly modifies a given DNN model so that the latent factors and the residual features can be processed separately. Last, it trains the modified DNN model with a new SGD algorithm that allows adaptive learning rates for the factor part and the residual part. A number of empirical experiments demonstrate its superior performance. The code is available at https://github.com/HazardNeo4869/FactorNormalization
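As an illustration of the first step, the sketch below splits a feature matrix into a low-rank factor part and a residual part using a truncated SVD. This is one common way to estimate a low-dimensional factor model; the paper's exact estimator may differ, and the function name `factor_decompose` is hypothetical.

```python
import numpy as np

def factor_decompose(X, k):
    """Split features X (n x p) into a rank-k factor part and a residual part.

    Uses truncated SVD of the centered data as a stand-in factor estimator
    (an illustrative assumption, not necessarily the paper's method).
    Returns the factor scores F (n x k), the residual features R (n x p),
    and the loading matrix B (k x p), with X - mean ~= F @ B + R.
    """
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    B = Vt[:k]            # k x p factor loadings
    F = Xc @ B.T          # n x k latent factor scores
    R = Xc - F @ B        # n x p residual features
    return F, R, B

# Demo: synthetic data with a strong 3-factor structure.
rng = np.random.default_rng(0)
F0 = rng.standard_normal((500, 3))
B0 = rng.standard_normal((3, 100))
X = F0 @ B0 + 0.1 * rng.standard_normal((500, 100))
F, R, _ = factor_decompose(X, 3)
# The residual part carries far less variability than the raw features.
print(X.var(), R.var())
```

In a setup like this, the factor scores `F` and the residual features `R` could then be fed into separate input branches of the network and updated with different learning rates, in the spirit of the method's second and third steps.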