L2 Regularization

28 papers with code • 0 benchmarks • 0 datasets

See Weight Decay.

$L_{2}$ Regularization, or Weight Decay, is a regularization technique applied to the weights of a neural network. We minimize a loss function comprising both the primary loss function and a penalty on the $L_{2}$ norm of the weights:

$$L_{new}\left(w\right) = L_{original}\left(w\right) + \lambda{w^{T}w}$$

where $\lambda$ is a hyperparameter controlling the strength of the penalty (larger values encourage smaller weights).

Weight decay can also be incorporated directly into the weight update rule, rather than implicitly through the objective function. In practice, "weight decay" often refers to the implementation specified directly in the weight update rule, whereas "L2 regularization" usually refers to the implementation specified in the objective function.
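The two views are linked by differentiating the penalty: $\nabla L_{new}\left(w\right) = \nabla L_{original}\left(w\right) + 2\lambda w$, so a gradient step $w \leftarrow w - \eta\left(\nabla L_{original}\left(w\right) + 2\lambda w\right)$ is an ordinary step on the primary loss followed by shrinking ("decaying") the weights. The sketch below, assuming PyTorch and purely illustrative values for `lam` and `lr`, contrasts the two implementations: adding the penalty to the objective versus applying the decay directly in the update rule.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(10, 1)              # illustrative model
x, y = torch.randn(32, 10), torch.randn(32, 1)
lam, lr = 1e-4, 0.1                         # illustrative penalty strength and step size

# (1) L2 regularization: add lambda * w^T w to the objective and differentiate the sum.
loss = F.mse_loss(model(x), y) + lam * sum((p * p).sum() for p in model.parameters())
loss.backward()
with torch.no_grad():
    for p in model.parameters():
        p -= lr * p.grad                    # gradient already contains the 2*lam*w term

model.zero_grad()

# (2) Weight decay: differentiate only the primary loss, then shrink weights in the update.
F.mse_loss(model(x), y).backward()
with torch.no_grad():
    for p in model.parameters():
        p -= lr * (p.grad + 2 * lam * p)    # decay term added directly to the update rule
```

For plain (stochastic) gradient descent the two updates coincide; with adaptive optimizers such as Adam the decoupled (weight-decay) update generally behaves differently, which is why the distinction matters in practice.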

Latest papers with no code

Linking Neural Collapse and L2 Normalization with Improved Out-of-Distribution Detection in Deep Neural Networks

no code yet • 17 Sep 2022

We propose a simple modification to standard ResNet architectures--L2 normalization over feature space--that substantially improves out-of-distribution (OoD) performance on the previously proposed Deep Deterministic Uncertainty (DDU) benchmark.

On the utility and protection of optimization with differential privacy and classic regularization techniques

no code yet • 7 Sep 2022

According to the literature, this approach has proven to be a successful defence against several models' privacy attacks, but its downside is a substantial degradation of the models' performance.

Perturbation of Deep Autoencoder Weights for Model Compression and Classification of Tabular Data

no code yet • 17 May 2022

Unlike dropout learning, the proposed weight perturbation routine additionally achieves 15% to 40% sparsity across six tabular data sets for the compression of deep pretrained models.

Guidelines for the Regularization of Gammas in Batch Normalization for Deep Residual Networks

no code yet • 15 May 2022

L2 regularization for weights in neural networks is widely used as a standard training trick.

A Note on the Regularity of Images Generated by Convolutional Neural Networks

no code yet • 22 Apr 2022

The regularity of images generated by convolutional neural networks, such as the U-net, generative networks, or the deep image prior, is analyzed.

A Closer Look at Rehearsal-Free Continual Learning

no code yet • 31 Mar 2022

Next, we explore how to leverage knowledge from a pre-trained model in rehearsal-free continual learning and find that vanilla L2 parameter regularization outperforms EWC parameter regularization and feature distillation.

Probabilistic fine-tuning of pruning masks and PAC-Bayes self-bounded learning

no code yet • 22 Oct 2021

In the linear model, we show that a PAC-Bayes generalization error bound is controlled by the magnitude of the change in feature alignment between the 'prior' and 'posterior' data.

Regularized Training of Nearest Neighbor Language Models

no code yet • NAACL (ACL) 2022

In particular, we find that the added L2 regularization seems to improve the performance for high-frequency words without deteriorating the performance for low-frequency ones.

Saddle-to-Saddle Dynamics in Deep Linear Networks: Small Initialization Training, Symmetry, and Sparsity

no code yet • 30 Jun 2021

The dynamics of Deep Linear Networks (DLNs) is dramatically affected by the variance $\sigma^2$ of the parameters at initialization $\theta_0$.

Guiding Teacher Forcing with Seer Forcing for Neural Machine Translation

no code yet • ACL 2021

Meanwhile, we force the conventional decoder to simulate the behaviors of the seer decoder via knowledge distillation.