Generative Models

A Viewmaker Network is a generative model that learns to produce input-dependent views for contrastive learning. It is trained jointly with an encoder network, and adversarially: the viewmaker creates views that increase the encoder's contrastive loss. Rather than directly outputting a view of an image, the viewmaker outputs a stochastic perturbation that is added to the input. This perturbation is projected onto an $\ell_{p}$ sphere, which controls the effective strength of the view, similar to methods in adversarial robustness. This constrained adversarial training enables the model to reduce the mutual information between different views while preserving useful input features for the encoder to learn from.

Specifically, the encoder and viewmaker are optimized in alternating steps to minimize and maximize the contrastive loss $\mathcal{L}$, respectively. The viewmaker is an image-to-image neural network with an architecture adapted from work on style transfer. It ingests the input image and outputs a perturbation that is constrained to an $\ell_{1}$ sphere. The sphere's radius is the volume of the input tensor (its number of elements) times a hyperparameter $\epsilon$, the distortion budget, which determines the strength of the applied perturbation. The perturbation is added to the input and, for images, optionally clamped so that all pixel values lie in $[0,1]$.
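
The constraint step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `project_to_l1_sphere` and `apply_view` are hypothetical, the image is treated as a flat list of floats, random noise stands in for the viewmaker's raw output, and simple rescaling is assumed as the way to place the perturbation on the $\ell_{1}$ sphere of radius $\epsilon$ times the number of input elements.

```python
import random


def project_to_l1_sphere(raw, epsilon, tiny=1e-8):
    """Rescale `raw` so its L1 norm is (approximately) epsilon * len(raw).

    Assumption: plain rescaling is used here; the original method's exact
    projection may differ.
    """
    radius = epsilon * len(raw)            # distortion budget times input volume
    l1_norm = sum(abs(v) for v in raw)
    scale = radius / (l1_norm + tiny)      # `tiny` guards against division by zero
    return [v * scale for v in raw]


def apply_view(image, raw, epsilon, clamp=True):
    """Add the constrained perturbation to the image, optionally clamping to [0, 1]."""
    delta = project_to_l1_sphere(raw, epsilon)
    view = [x + d for x, d in zip(image, delta)]
    if clamp:
        view = [min(1.0, max(0.0, v)) for v in view]
    return view, delta


random.seed(0)
# Hypothetical 3x4x4 image, flattened, with pixel values in [0, 1].
image = [random.random() for _ in range(3 * 4 * 4)]
# Stand-in for the viewmaker network's unconstrained output.
raw = [random.gauss(0.0, 1.0) for _ in range(len(image))]

view, delta = apply_view(image, raw, epsilon=0.05)
l1 = sum(abs(d) for d in delta)            # close to 0.05 * 48
```

With this scaling, the mean absolute perturbation per element equals $\epsilon$, so the same distortion budget yields a comparable per-pixel strength across inputs of different sizes.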

Source: Viewmaker Networks: Learning Views for Unsupervised Representation Learning
