L2 Regularization


See Weight Decay.

$L_{2}$ Regularization, or Weight Decay, is a regularization technique applied to the weights of a neural network. We minimize a loss function comprising both the primary loss function and a penalty on the $L_{2}$ norm of the weights:

$$L_{new}\left(w\right) = L_{original}\left(w\right) + \lambda{w^{T}w}$$

where $\lambda$ is a hyperparameter that controls the strength of the penalty (larger values of $\lambda$ encourage smaller weights).
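
For concreteness, here is a minimal PyTorch sketch of this penalized objective; the model, data, and value of $\lambda$ below are illustrative placeholders rather than anything prescribed by the definition above:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal sketch of L_new(w) = L_original(w) + lambda * w^T w.
# Model, data, and lam are illustrative placeholders.
model = nn.Linear(10, 2)
inputs = torch.randn(32, 10)
targets = torch.randint(0, 2, (32,))
lam = 1e-4  # penalty strength (lambda)

primary_loss = F.cross_entropy(model(inputs), targets)        # L_original(w)
l2_penalty = sum((p ** 2).sum() for p in model.parameters())  # w^T w over all parameters
loss = primary_loss + lam * l2_penalty                        # L_new(w)
loss.backward()
```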

Weight decay can also be incorporated directly into the weight update rule, rather than implicitly through the objective function. The term weight decay often refers to the implementation where the decay is specified directly in the weight update rule, whereas L2 regularization usually refers to the implementation specified in the objective function.
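
The difference between the two implementations can be made explicit with hand-written SGD updates. The following sketch is illustrative only: variant (a) adds the penalty to the objective so it enters the gradient, while variant (b) applies the decay term directly in the update rule. For vanilla SGD the two coincide up to a rescaling of $\lambda$, but they generally differ for adaptive optimizers such as Adam.

```python
import torch

lr, lam = 0.1, 1e-4                       # illustrative step size and decay strength
w = torch.randn(10, requires_grad=True)   # stand-in parameter vector

def primary_loss(w):
    return (w ** 2).mean()                # stand-in for L_original(w)

# (a) L2 regularization: penalty is part of the objective, so it shows up in w.grad.
loss = primary_loss(w) + lam * (w ** 2).sum()
loss.backward()
with torch.no_grad():
    w -= lr * w.grad
w.grad = None

# (b) Weight decay: the decay term is written directly into the update rule.
loss = primary_loss(w)
loss.backward()
with torch.no_grad():
    w -= lr * w.grad + lr * lam * w
w.grad = None
```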

Most implemented papers

Re-evaluating Continual Learning Scenarios: A Categorization and Case for Strong Baselines

GT-RIPL/Continual-Learning-Benchmark 30 Oct 2018

Continual learning has received a great deal of attention recently with several approaches being proposed.

Convolutional Neural Networks for Facial Expression Recognition

mgeezzyy/Facial-Expression-Recognition-2018 22 Apr 2017

We have developed convolutional neural networks (CNN) for a facial expression recognition task.

Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks

epfml/req 26 May 2023

This study investigates how weight decay affects the update behavior of individual neurons in deep neural networks through a combination of applied analysis and experimentation.

The Transient Nature of Emergent In-Context Learning in Transformers

aadityasingh/icl-transience NeurIPS 2023

The transient nature of ICL is observed in transformers across a range of model sizes and datasets, raising the question of how much to "overtrain" transformers when seeking compact, cheaper-to-run models.

On Regularization Parameter Estimation under Covariate Shift

wmkouw/covshift-l2reg 31 Jul 2016

This paper identifies a problem with the usual procedure for L2-regularization parameter estimation in a domain adaptation setting.

Neurogenesis-Inspired Dictionary Learning: Online Model Adaption in a Changing World

sgarg87/neurogenesis_inspired_dictionary_learning 22 Jan 2017

In this paper, we focus on online representation learning in non-stationary environments which may require continuous adaptation of model architecture.

Collaboratively Weighting Deep and Classic Representation via L2 Regularization for Image Classification

zengsn/research 21 Feb 2018

We propose a deep collaborative weight-based classification (DeepCWC) method to resolve this problem, by providing a novel option to fully take advantage of deep features in classic machine learning.

Quantifying Generalization in Reinforcement Learning

openai/coinrun 6 Dec 2018

In this paper, we investigate the problem of overfitting in deep reinforcement learning.

What is the Effect of Importance Weighting in Deep Learning?

hajohajo/TrainingFrameworkTrackDNN 8 Dec 2018

Importance-weighted risk minimization is a key ingredient in many machine learning algorithms for causal inference, domain adaptation, class imbalance, and off-policy reinforcement learning.

Learning a smooth kernel regularizer for convolutional neural networks

rfeinman/SK-regularization 5 Mar 2019

We propose a smooth kernel regularizer that encourages spatial correlations in convolution kernel weights.