Group Normalization

Introduced by Wu and He in Group Normalization

Group Normalization (GN) is a normalization layer that divides channels into groups and normalizes the features within each group. GN does not exploit the batch dimension, so its computation is independent of batch size. When each group contains a single channel (the number of groups equals the number of channels), GN is equivalent to Instance Normalization; when all channels are placed in a single group, it is equivalent to Layer Normalization.
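These properties can be checked with a minimal NumPy sketch (the helper name `gn` and the tensor shapes are illustrative assumptions; the learnable per-channel scale and shift used in practice are omitted): normalizing a whole batch and normalizing each sample alone give identical results, and with one channel per group the output matches per-channel (instance) normalization.

```python
import numpy as np

def gn(x, G, eps=1e-5):
    # Group Normalization sketch: split the C channels into G groups and
    # normalize each (sample, group) slice with its own mean and variance.
    N, C, H, W = x.shape
    g = x.reshape(N, G, C // G, H, W)
    mu = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    return ((g - mu) / np.sqrt(var + eps)).reshape(N, C, H, W)

x = np.random.randn(4, 6, 5, 5)

# Computation is batch-independent: per-sample == whole-batch result.
whole = gn(x, G=3)
per_sample = np.concatenate([gn(x[i:i + 1], G=3) for i in range(4)])
assert np.allclose(whole, per_sample)

# With group size 1 (G = C), each channel is normalized on its own,
# which is exactly Instance Normalization.
inst = (x - x.mean(axis=(2, 3), keepdims=True)) / \
       np.sqrt(x.var(axis=(2, 3), keepdims=True) + 1e-5)
assert np.allclose(gn(x, G=6), inst)
```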

As motivation for the method, many classical features like SIFT and HOG had group-wise features and involved group-wise normalization. For example, a HOG vector is the outcome of several spatial cells where each cell is represented by a normalized orientation histogram.

Formally, Group Normalization is defined as:

$$\mu_{i} = \frac{1}{m}\sum_{k\in\mathcal{S}_{i}}x_{k}$$

$$\sigma^{2}_{i} = \frac{1}{m}\sum_{k\in\mathcal{S}_{i}}\left(x_{k}-\mu_{i}\right)^{2}$$

$$\hat{x}_{i} = \frac{x_{i} - \mu_{i}}{\sqrt{\sigma^{2}_{i}+\epsilon}}$$

Here $x$ is the feature computed by a layer, $i$ is an index, and $m$ is the size of the set $\mathcal{S}_{i}$ over which the mean and variance are computed. A Group Norm layer defines this set as:

$$\mathcal{S}_{i} = \left\{k \mid k_{N} = i_{N},\ \left\lfloor\frac{k_{C}}{C/G}\right\rfloor = \left\lfloor\frac{i_{C}}{C/G}\right\rfloor\right\}$$

Here $G$ is the number of groups, a pre-defined hyper-parameter ($G = 32$ by default), and $C/G$ is the number of channels per group. $\lfloor\cdot\rfloor$ is the floor operation, and the second condition means that the indices $i$ and $k$ belong to the same group of channels, assuming each group of channels is stored sequentially along the $C$ axis.
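The set membership above can be evaluated per index: the group of channel $c$ is $\lfloor c/(C/G)\rfloor$, and $\mu_{i}$, $\sigma^{2}_{i}$ are computed over the $m = (C/G) \cdot H \cdot W$ elements sharing that sample and group. A direct, loop-based NumPy sketch of this definition (slow, for illustration only; the function name `group_stats` is an assumption) agrees with the usual vectorized reshape formulation:

```python
import numpy as np

def group_stats(x, G):
    # For each sample n and group g, gather S_i = {k : k_N = n,
    # floor(k_C / (C/G)) = g} and return its mean and variance.
    N, C, H, W = x.shape
    size = C // G                       # channels per group
    mu = np.empty((N, G))
    var = np.empty((N, G))
    for n in range(N):
        for g in range(G):
            members = [x[n, c] for c in range(C) if c // size == g]
            vals = np.stack(members)    # m = (C/G) * H * W values
            mu[n, g] = vals.mean()
            var[n, g] = vals.var()
    return mu, var

x = np.random.randn(2, 8, 4, 4)
mu, var = group_stats(x, G=4)

# Matches the vectorized reshape formulation of the same statistics.
ref = x.reshape(2, 4, 2, 4, 4)
assert np.allclose(mu, ref.mean(axis=(2, 3, 4)))
assert np.allclose(var, ref.var(axis=(2, 3, 4)))
```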

Source: Group Normalization

Latest Papers

PAPER DATE
Batch Group Normalization
Xiao-Yun Zhou, Jiacheng Sun, Nanyang Ye, Xu Lan, Qijun Luo, Bo-Lin Lai, Pedro Esperanca, Guang-Zhong Yang, Zhenguo Li
2020-12-04
High resolution weakly supervised localization architectures for medical images
Konpat Preechakul, Sira Sriswasdi, Boonserm Kijsirikul, Ekapol Chuangsuwanich
2020-10-22
BYOL works even without batch statistics
Pierre H. Richemond, Jean-bastien Grill, Florent Altché, Corentin Tallec, Florian Strub, Andrew Brock, Samuel Smith, Soham De, Razvan Pascanu, Bilal Piot, Michal Valko
2020-10-20
Group Whitening: Balancing Learning Efficiency and Representational Capacity
Lei Huang, Yi Zhou, Li Liu, Fan Zhu, Ling Shao
2020-09-28
New Interpretations of Normalization Methods in Deep Learning
Jiacheng Sun, Xiangyong Cao, Hanwen Liang, Weiran Huang, Zewei Chen, Zhenguo Li
2020-06-16
Towards Deeper Graph Neural Networks with Differentiable Group Normalization
Kaixiong Zhou, Xiao Huang, Yuening Li, Daochen Zha, Rui Chen, Xia Hu
2020-06-12
Effective Data Fusion with Generalized Vegetation Index: Evidence from Land Cover Segmentation in Agriculture
Hao Sheng, Xiao Chen, Jingyi Su, Ram Rajagopal, Andrew Ng
2020-05-07
Evolving Normalization-Activation Layers
Hanxiao Liu, Andrew Brock, Karen Simonyan, Quoc V. Le
2020-04-06
Extended Batch Normalization
Chunjie Luo, Jianfeng Zhan, Lei Wang, Wanling Gao
2020-03-12
Self-Supervised Spatio-Temporal Representation Learning Using Variable Playback Speed Prediction
Hyeon Cho, Tae-hoon Kim, Hyung Jin Chang, Wonjun Hwang
2020-03-05
Big Transfer (BiT): General Visual Representation Learning
Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Joan Puigcerver, Jessica Yung, Sylvain Gelly, Neil Houlsby
2019-12-24
PointRend: Image Segmentation as Rendering
Alexander Kirillov, Yuxin Wu, Kaiming He, Ross Girshick
2019-12-17
Local Context Normalization: Revisiting Local Normalization
Anthony Ortiz, Caleb Robinson, Dan Morris, Olac Fuentes, Christopher Kiekintveld, Md Mahmudulla Hassan, Nebojsa Jojic
2019-12-12
Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection
Shifeng Zhang, Cheng Chi, Yongqiang Yao, Zhen Lei, Stan Z. Li
2019-12-05
An Exponential Learning Rate Schedule for Deep Learning
Zhiyuan Li, Sanjeev Arora
2019-10-16
The Non-IID Data Quagmire of Decentralized Machine Learning
Kevin Hsieh, Amar Phanishayee, Onur Mutlu, Phillip B. Gibbons
2019-10-01
An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation
Vincent Michalski, Vikram Voleti, Samira Ebrahimi Kahou, Anthony Ortiz, Pascal Vincent, Chris Pal, Doina Precup
2019-07-31
Switchable Normalization for Learning-to-Normalize Deep Representation
Ping Luo, Ruimao Zhang, Jiamin Ren, Zhanglin Peng, Jingyu Li
2019-07-22
IMP: Instance Mask Projection for High Accuracy Semantic Segmentation of Things
Cheng-Yang Fu, Tamara L. Berg, Alexander C. Berg
2019-06-15
Four Things Everyone Should Know to Improve Batch Normalization
Cecilia Summers, Michael J. Dinneen
2019-06-09
Online Normalization for Training Neural Networks
Vitaliy Chiley, Ilya Sharapov, Atli Kosson, Urs Koster, Ryan Reece, Sofia Samaniego de la Fuente, Vishal Subbiah, Michael James
2019-05-15
Instance-Level Meta Normalization
Cheng-Yang Fu, Mykhailo Shvets, Alexander C. Berg
2019-04-06
Panoptic Feature Pyramid Networks
Alexander Kirillov, Ross Girshick, Kaiming He, Piotr Dollár
2019-01-08
Kalman Normalization: Normalizing Internal Representations Across Network Layers
Guangrun Wang, Jiefeng Peng, Ping Luo, Xinjiang Wang, Liang Lin
2018-12-01
Normalization in Training U-Net for 2D Biomedical Semantic Segmentation
Xiao-Yun Zhou, Guang-Zhong Yang
2018-09-11
Differentiable Learning-to-Normalize via Switchable Normalization
Ping Luo, Jiamin Ren, Zhanglin Peng, Ruimao Zhang, Jingyu Li
2018-06-28
Group Normalization
Yuxin Wu, Kaiming He
2018-03-22
Cascade R-CNN: Delving into High Quality Object Detection
Zhaowei Cai, Nuno Vasconcelos
2017-12-03
