L2 Regularization


See Weight Decay.

$L_{2}$ Regularization, or Weight Decay, is a regularization technique applied to the weights of a neural network. We minimize a loss function comprising both the primary loss function and a penalty on the $L_{2}$ norm of the weights:

$$L_{new}\left(w\right) = L_{original}\left(w\right) + \lambda{w^{T}w}$$

where $\lambda$ is a hyperparameter that controls the strength of the penalty (larger values of $\lambda$ encourage smaller weights).
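
For concreteness, here is a minimal PyTorch sketch of this penalized objective; the model, data, and value of $\lambda$ below are illustrative placeholders rather than anything prescribed by the definition above:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal sketch of L_new(w) = L_original(w) + lambda * w^T w.
# Model, data, and lam are illustrative placeholders.
model = nn.Linear(10, 2)
inputs = torch.randn(32, 10)
targets = torch.randint(0, 2, (32,))
lam = 1e-4  # penalty strength (lambda)

primary_loss = F.cross_entropy(model(inputs), targets)        # L_original(w)
l2_penalty = sum((p ** 2).sum() for p in model.parameters())  # w^T w over all parameters
loss = primary_loss + lam * l2_penalty                        # L_new(w)
loss.backward()
```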

Weight decay can also be incorporated directly into the weight update rule, rather than implicitly through the objective function. The term weight decay often refers to the implementation where the decay is specified directly in the weight update rule, whereas L2 regularization usually refers to the implementation specified in the objective function.
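
The difference between the two implementations can be made explicit with hand-written SGD updates. The following sketch is illustrative only: variant (a) adds the penalty to the objective so it enters the gradient, while variant (b) applies the decay term directly in the update rule. For vanilla SGD the two coincide up to a rescaling of $\lambda$, but they generally differ for adaptive optimizers such as Adam.

```python
import torch

lr, lam = 0.1, 1e-4                       # illustrative step size and decay strength
w = torch.randn(10, requires_grad=True)   # stand-in parameter vector

def primary_loss(w):
    return (w ** 2).mean()                # stand-in for L_original(w)

# (a) L2 regularization: penalty is part of the objective, so it shows up in w.grad.
loss = primary_loss(w) + lam * (w ** 2).sum()
loss.backward()
with torch.no_grad():
    w -= lr * w.grad
w.grad = None

# (b) Weight decay: the decay term is written directly into the update rule.
loss = primary_loss(w)
loss.backward()
with torch.no_grad():
    w -= lr * w.grad + lr * lam * w
w.grad = None
```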

Most implemented papers

Re-evaluating Continual Learning Scenarios: A Categorization and Case for Strong Baselines

GT-RIPL/Continual-Learning-Benchmark 30 Oct 2018

Continual learning has received a great deal of attention recently with several approaches being proposed.

Convolutional Neural Networks for Facial Expression Recognition

mgeezzyy/Facial-Expression-Recognition-2018 22 Apr 2017

We have developed convolutional neural networks (CNN) for a facial expression recognition task.

Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks

epfml/req 26 May 2023

This study investigates how weight decay affects the update behavior of individual neurons in deep neural networks through a combination of applied analysis and experimentation.

The Transient Nature of Emergent In-Context Learning in Transformers

aadityasingh/icl-transience NeurIPS 2023

The transient nature of ICL is observed in transformers across a range of model sizes and datasets, raising the question of how much to "overtrain" transformers when seeking compact, cheaper-to-run models.

On Regularization Parameter Estimation under Covariate Shift

wmkouw/covshift-l2reg 31 Jul 2016

This paper identifies a problem with the usual procedure for L2-regularization parameter estimation in a domain adaptation setting.

Neurogenesis-Inspired Dictionary Learning: Online Model Adaption in a Changing World

sgarg87/neurogenesis_inspired_dictionary_learning 22 Jan 2017

In this paper, we focus on online representation learning in non-stationary environments which may require continuous adaptation of model architecture.

Collaboratively Weighting Deep and Classic Representation via L2 Regularization for Image Classification

zengsn/research 21 Feb 2018

We propose a deep collaborative weight-based classification (DeepCWC) method to resolve this problem, by providing a novel option to fully take advantage of deep features in classic machine learning.

Quantifying Generalization in Reinforcement Learning

openai/coinrun 6 Dec 2018

In this paper, we investigate the problem of overfitting in deep reinforcement learning.

What is the Effect of Importance Weighting in Deep Learning?

hajohajo/TrainingFrameworkTrackDNN 8 Dec 2018

Importance-weighted risk minimization is a key ingredient in many machine learning algorithms for causal inference, domain adaptation, class imbalance, and off-policy reinforcement learning.

Learning a smooth kernel regularizer for convolutional neural networks

rfeinman/SK-regularization 5 Mar 2019

We propose a smooth kernel regularizer that encourages spatial correlations in convolution kernel weights.