Conditional Batch Normalization

Introduced by de Vries et al. in Modulating early visual processing by language

Conditional Batch Normalization (CBN) is a class-conditional variant of batch normalization. The key idea is to predict the $\gamma$ and $\beta$ of the batch normalization from an embedding, e.g. a language embedding in visual question answering (VQA). CBN enables the linguistic embedding to manipulate entire feature maps by scaling them up or down, negating them, or shutting them off. CBN has also been used in GANs to allow class information to affect the batch normalization parameters.

Consider a single convolutional layer with a batch normalization module $\text{BN}\left(F_{i,c,h,w} \mid \gamma_{c}, \beta_{c}\right)$ for which pretrained scalars $\gamma_{c}$ and $\beta_{c}$ are available. We would like to predict these affine scaling parameters directly from, e.g., a language embedding $\mathbf{e_{q}}$. At the start of training, the predicted parameters must be close to the pretrained values in order to recover the original ResNet model, since a poor initialization can significantly deteriorate performance. Unfortunately, it is difficult to initialize a network to output the pretrained $\gamma$ and $\beta$ directly. The authors therefore propose to predict changes $\Delta\gamma_{c}$ and $\Delta\beta_{c}$ to the frozen original scalars, since it is straightforward to initialize a neural network to produce an output with zero mean and small variance.
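As a concrete illustration, here is a minimal PyTorch sketch of such a delta predictor (PyTorch and all names here, such as DeltaMLP, emb_dim, hidden_dim, and num_channels, are illustrative assumptions, not from the paper). Zero-initializing the output layer is one concrete choice consistent with the zero-mean, small-variance initialization the authors describe: the predicted deltas start at exactly zero, so the pretrained batch normalization is recovered at initialization.

```python
import torch
import torch.nn as nn

class DeltaMLP(nn.Module):
    """One-hidden-layer MLP predicting per-channel deltas from an embedding.

    Hypothetical sketch: emb_dim, hidden_dim, and num_channels are
    illustrative names, not taken from the paper.
    """
    def __init__(self, emb_dim: int, num_channels: int, hidden_dim: int = 256):
        super().__init__()
        self.hidden = nn.Linear(emb_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, num_channels)
        # Zero-initialize the output layer: the MLP then predicts zero
        # deltas at the start of training, so the frozen pretrained
        # gamma and beta are used unchanged.
        nn.init.zeros_(self.out.weight)
        nn.init.zeros_(self.out.bias)

    def forward(self, e_q: torch.Tensor) -> torch.Tensor:
        # e_q: (N, emb_dim) -> deltas: (N, num_channels)
        return self.out(torch.relu(self.hidden(e_q)))
```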

The authors use a one-hidden-layer MLP to predict these deltas from a question embedding $\mathbf{e_{q}}$ for all feature maps within the layer:

$$\Delta\beta = \text{MLP}\left(\mathbf{e_{q}}\right)$$

$$\Delta\gamma = \text{MLP}\left(\mathbf{e_{q}}\right)$$

So, given a feature map with $C$ channels, each MLP outputs a vector of size $C$. These predictions are then added to the pretrained $\beta$ and $\gamma$ parameters:

$$ \hat{\beta}_{c} = \beta_{c} + \Delta\beta_{c} $$

$$ \hat{\gamma}_{c} = \gamma_{c} + \Delta\gamma_{c} $$

Finally, these updated $\hat{\beta}$ and $\hat{\gamma}$ are used as parameters for the batch normalization: $\text{BN}\left(F_{i,c,h,w} \mid \hat{\gamma}_{c}, \hat{\beta}_{c}\right)$. The authors freeze all ResNet parameters, including $\gamma$ and $\beta$, during training. A ResNet consists of four stages of computation, each subdivided into several residual blocks. In each block, the authors apply CBN to the three convolutional layers.
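Putting the pieces together, the sketch below (again assuming PyTorch and reusing the hypothetical DeltaMLP above; ConditionalBatchNorm2d is an illustrative name) shows a CBN layer that keeps the pretrained $\gamma_{c}$ and $\beta_{c}$ frozen as buffers and applies the per-example updated parameters $\hat{\gamma}_{c}$ and $\hat{\beta}_{c}$ after normalization:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConditionalBatchNorm2d(nn.Module):
    """Sketch of CBN: frozen pretrained (gamma, beta) plus predicted deltas."""

    def __init__(self, num_channels: int, emb_dim: int):
        super().__init__()
        # Pretrained affine parameters, registered as buffers so they
        # stay frozen; in practice they would be copied from the
        # pretrained ResNet rather than set to the identity.
        self.register_buffer("gamma", torch.ones(num_channels))
        self.register_buffer("beta", torch.zeros(num_channels))
        self.register_buffer("running_mean", torch.zeros(num_channels))
        self.register_buffer("running_var", torch.ones(num_channels))
        # One zero-initialized MLP per parameter (see DeltaMLP above).
        self.delta_gamma = DeltaMLP(emb_dim, num_channels)
        self.delta_beta = DeltaMLP(emb_dim, num_channels)

    def forward(self, x: torch.Tensor, e_q: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W); e_q: (N, emb_dim)
        # Normalize without an affine transform first.
        x = F.batch_norm(x, self.running_mean, self.running_var,
                         training=self.training)
        # gamma_hat = gamma + delta_gamma, beta_hat = beta + delta_beta,
        # computed per example because e_q differs across the batch.
        gamma_hat = self.gamma + self.delta_gamma(e_q)  # (N, C)
        beta_hat = self.beta + self.delta_beta(e_q)     # (N, C)
        return gamma_hat[:, :, None, None] * x + beta_hat[:, :, None, None]
```

In the VQA setting described above, a module like this would stand in for each batch normalization layer of the pretrained ResNet, with the convolutional weights and the original $\gamma$, $\beta$ kept frozen so that only the delta MLPs (and the question encoder producing $\mathbf{e_{q}}$) receive gradient updates.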

Source: Modulating early visual processing by language

Latest Papers

not-so-BigGAN: Generating High-Fidelity Images on a Small Compute Budget
Seungwook Han, Akash Srivastava, Cole Hurwitz, Prasanna Sattigeri, David D. Cox
2020-09-09
Neural Crossbreed: Neural Based Image Metamorphosis
Sanghun Park, Kwanggyoon Seo, Junyong Noh
2020-09-02
A Spectral Energy Distance for Parallel Speech Synthesis
Alexey A. Gritsenko, Tim Salimans, Rianne van den Berg, Jasper Snoek, Nal Kalchbrenner
2020-08-03
Instance Selection for GANs
Terrance DeVries, Michal Drozdzal, Graham W. Taylor
2020-07-30
Interpolating GANs to Scaffold Autotelic Creativity
Ziv Epstein, Océane Boulais, Skylar Gordon, Matt Groh
2020-07-21
Differentiable Augmentation for Data-Efficient GAN Training
Shengyu Zhao, Zhijian Liu, Ji Lin, Jun-Yan Zhu, Song Han
2020-06-18
Training Generative Adversarial Networks with Limited Data
Tero Karras, Miika Aittala, Janne Hellsten, Samuli Laine, Jaakko Lehtinen, Timo Aila
2020-06-11
Learning disconnected manifolds: a no GANs land
Ugo Tanielian, Thibaut Issenhuth, Elvis Dohmatob, Jeremie Mary
2020-06-08
Big GANs Are Watching You: Towards Unsupervised Object Segmentation with Off-the-Shelf Generative Models
Andrey Voynov, Stanislav Morozov, Artem Babenko
2020-06-08
Network Fusion for Content Creation with Conditional INNs
Robin Rombach, Patrick Esser, Björn Ommer
2020-05-27
GANSpace: Discovering Interpretable GAN Controls
Erik Härkönen, Aaron Hertzmann, Jaakko Lehtinen, Sylvain Paris
2020-04-06
Evolving Normalization-Activation Layers
Hanxiao Liu, Andrew Brock, Karen Simonyan, Quoc V. Le
2020-04-06
Feature Quantization Improves GAN Training
Yang Zhao, Chunyuan Li, Ping Yu, Jianfeng Gao, Changyou Chen
2020-04-05
Exemplar Normalization for Learning Deep Representation
Ruimao Zhang, Zhanglin Peng, Lingyun Wu, Zhen Li, Ping Luo
2020-03-19
BigGAN-based Bayesian reconstruction of natural images from human brain activity
Kai Qiao, Jian Chen, Linyuan Wang, Chi Zhang, Li Tong, Bin Yan
2020-03-13
Transformation-based Adversarial Video Prediction on Large-Scale Data
Pauline Luc, Aidan Clark, Sander Dieleman, Diego de Las Casas, Yotam Doron, Albin Cassirer, Karen Simonyan
2020-03-09
A U-Net Based Discriminator for Generative Adversarial Networks
Edgar Schönfeld, Bernt Schiele, Anna Khoreva
2020-02-28
Improved Consistency Regularization for GANs
Zhengli Zhao, Sameer Singh, Honglak Lee, Zizhao Zhang, Augustus Odena, Han Zhang
2020-02-11
Reconstructing Natural Scenes from fMRI Patterns using BigBiGAN
Milad Mozafari, Leila Reddy, Rufin VanRullen
2020-01-31
Random Matrix Theory Proves that Deep Learning Representations of GAN-data Behave as Gaussian Mixtures
Mohamed El Amine Seddik, Cosme Louart, Mohamed Tamaazousti, Romain Couillet
2020-01-21
CNN-generated images are surprisingly easy to spot... for now
Sheng-Yu Wang, Oliver Wang, Richard Zhang, Andrew Owens, Alexei A. Efros
2019-12-23
LOGAN: Latent Optimisation for Generative Adversarial Networks
Yan Wu, Jeff Donahue, David Balduzzi, Karen Simonyan, Timothy Lillicrap
2019-12-02
Detecting GAN generated errors
Xiru Zhu, Fengdi Che, Tianzi Yang, Tzuyang Yu, David Meger, Gregory Dudek
2019-12-02
Semantic Hierarchy Emerges in Deep Generative Representations for Scene Synthesis
Ceyuan Yang, Yujun Shen, Bolei Zhou
2019-11-21
Improving sample diversity of a pre-trained, class-conditional GAN by changing its class embeddings
Qi Li, Long Mai, Michael A. Alcorn, Anh Nguyen
2019-10-10
High Fidelity Speech Synthesis with Adversarial Networks
Mikołaj Bińkowski, Jeff Donahue, Sander Dieleman, Aidan Clark, Erich Elsen, Norman Casagrande, Luis C. Cobo, Karen Simonyan
2019-09-25
Adversarial Video Generation on Complex Datasets
Aidan Clark, Jeff Donahue, Karen Simonyan
2019-07-15
Large Scale Adversarial Representation Learning
Jeff Donahue, Karen Simonyan
2019-07-04
Whitening and Coloring transform for GANs
Aliaksandr Siarohin, Enver Sangineto, Nicu Sebe
2019-05-01
Improved Precision and Recall Metric for Assessing Generative Models
Tuomas Kynkäänniemi, Tero Karras, Samuli Laine, Jaakko Lehtinen, Timo Aila
2019-04-15
High-Fidelity Image Generation With Fewer Labels
Mario Lucic, Michael Tschannen, Marvin Ritter, Xiaohua Zhai, Olivier Bachem, Sylvain Gelly
2019-03-06
Spatially Controllable Image Synthesis with Internal Representation Collaging
Ryohei Suzuki, Masanori Koyama, Takeru Miyato, Taizan Yonetsuji, Huachun Zhu
2018-11-26
Metropolis-Hastings view on variational inference and adversarial training
Kirill Neklyudov, Evgenii Egorov, Pavel Shvechikov, Dmitry Vetrov
2018-10-16
Large Scale GAN Training for High Fidelity Natural Image Synthesis
Andrew Brock, Jeff Donahue, Karen Simonyan
2018-09-28
Whitening and Coloring batch transform for GANs
Aliaksandr Siarohin, Enver Sangineto, Nicu Sebe
2018-06-01
Learning Visual Reasoning Without Strong Priors
Ethan Perez, Harm de Vries, Florian Strub, Vincent Dumoulin, Aaron Courville
2017-07-10
Modulating early visual processing by language
Harm de Vries, Florian Strub, Jérémie Mary, Hugo Larochelle, Olivier Pietquin, Aaron Courville
2017-07-02
