14 Jan 2022 • Li Wang, Yingcong Zhou, Zhiguo Fu
In this paper, we characterize the implicit regularization of momentum gradient descent (MGD) with early stopping by comparing it with explicit $\ell_2$-regularization (ridge).
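To make the comparison concrete, here is a minimal numerical sketch, not the paper's analysis. It assumes a least-squares loss, the heavy-ball form of momentum, zero initialization, and synthetic Gaussian data. The function names (`ridge`, `mgd`), step size, momentum value, and iteration counts are all illustrative choices, not taken from the paper.

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form minimizer of (1/2n)||Xw - y||^2 + (lam/2)||w||^2."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)

def mgd(X, y, lr, beta, steps):
    """Heavy-ball momentum gradient descent on (1/2n)||Xw - y||^2,
    started from w = 0 and stopped after `steps` iterations."""
    n, d = X.shape
    w = np.zeros(d)
    v = np.zeros(d)
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / n
        v = beta * v - lr * grad   # velocity update (assumed heavy-ball form)
        w = w + v
    return w

# Synthetic regression problem (assumed setup, for illustration only).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
w_true = rng.standard_normal(5)
y = X @ w_true + 0.1 * rng.standard_normal(200)

w_early = mgd(X, y, lr=0.05, beta=0.5, steps=10)    # early-stopped iterate
w_late = mgd(X, y, lr=0.05, beta=0.5, steps=5000)   # run to convergence
w_ols = ridge(X, y, 0.0)                            # unregularized solution

# Grid search for the ridge penalty whose solution is closest to the
# early-stopped iterate.
lams = np.logspace(-3, 1, 200)
dists = [np.linalg.norm(ridge(X, y, lam) - w_early) for lam in lams]
closest_lam = lams[int(np.argmin(dists))]

print("||w_early|| =", np.linalg.norm(w_early))
print("||w_ols||   =", np.linalg.norm(w_ols))
print("closest ridge penalty:", closest_lam)
```

In this toy setting, the early-stopped iterate is shrunk toward zero relative to the unregularized solution, much as a ridge solution with a positive penalty would be, while running MGD to convergence recovers the unregularized solution. The grid search is only a crude way to pair a stopping time with a penalty; the precise correspondence is what the paper sets out to characterize.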