Stochastic Optimization

Stochastic Weight Averaging

Introduced by Izmailov et al. in Averaging Weights Leads to Wider Optima and Better Generalization

Stochastic Weight Averaging (SWA) is an optimization procedure that averages multiple points along the trajectory of SGD run with a cyclical or constant learning rate. The name has two readings. First, the result is an average of SGD weights. Second, with a cyclical or constant learning rate, SGD iterates approximately sample from the loss surface of the network, so the weights being averaged are themselves stochastic. Averaging these samples helps the procedure find broader optima.
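The averaging step can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names (update_average, swa_sgd, noisy_quadratic_grad), the toy quadratic loss, and the default values for lr, swa_start and swa_every are all assumptions chosen for illustration. It uses a constant learning rate and the running-mean update w_swa <- (w_swa * n + w) / (n + 1).

```python
import random


def update_average(w_swa, w, n_avg):
    """Fold a new weight vector w into the running mean of n_avg earlier ones."""
    return [(a * n_avg + wi) / (n_avg + 1) for a, wi in zip(w_swa, w)]


def swa_sgd(grad_fn, w0, lr=0.1, steps=200, swa_start=100, swa_every=5):
    """Constant-learning-rate SGD that averages weights sampled along its path.

    grad_fn maps a weight list to a (possibly noisy) gradient list.
    Returns the last SGD iterate, the averaged weights, and the number
    of weight vectors that went into the average.
    """
    w = list(w0)
    w_swa, n_avg = None, 0
    for step in range(1, steps + 1):
        g = grad_fn(w)
        w = [wi - lr * gi for wi, gi in zip(w, g)]  # plain SGD step
        # After a warm-up phase, sample the iterate every swa_every steps.
        if step >= swa_start and (step - swa_start) % swa_every == 0:
            w_swa = list(w) if w_swa is None else update_average(w_swa, w, n_avg)
            n_avg += 1
    return w, w_swa, n_avg


def noisy_quadratic_grad(rng, noise=1.0):
    """Gradient of the toy loss 0.5 * ||w||^2 plus Gaussian noise (assumed setup)."""
    def grad(w):
        return [wi + rng.gauss(0.0, noise) for wi in w]
    return grad


if __name__ == "__main__":
    rng = random.Random(0)
    last, averaged, n = swa_sgd(noisy_quadratic_grad(rng), [5.0, -3.0])
    print("last SGD iterate:", last)
    print("SWA weights:     ", averaged, "averaged over", n, "points")
```

On this toy problem the last SGD iterate keeps bouncing around the minimum at the origin because of gradient noise, while the averaged weights usually sit closer to it. For a real network, batch-normalization statistics typically need to be recomputed for the averaged weights, which this sketch leaves out.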

Source: Averaging Weights Leads to Wider Optima and Better Generalization




