InfoGAN is a type of generative adversarial network that modifies the GAN objective to encourage the learning of interpretable, meaningful representations. It does so by maximizing the mutual information between a small, fixed subset of the GAN's noise variables (the latent codes) and the observations.
Formally, InfoGAN is defined as a minimax game with a variational regularization of mutual information and the hyperparameter $\lambda$:
$$ \min_{G, Q}\max_{D}\,V_{\text{InfoGAN}}\left(D, G, Q\right) = V\left(D, G\right) - \lambda L_{I}\left(G, Q\right) $$
where $Q$ is an auxiliary distribution that approximates the posterior $P\left(c\mid{x}\right)$, the probability of the latent code $c$ given the data $x$, and $L_{I}$ is a variational lower bound on the mutual information between the latent code and the observations.
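Concretely, the lower bound from the paper replaces the intractable posterior with $Q$ and takes an expectation over codes and generated samples:

$$ L_{I}\left(G, Q\right) = \mathbb{E}_{c\sim P(c),\, x\sim G(z,c)}\left[\log Q\left(c\mid x\right)\right] + H(c) \le I\left(c;\, G(z,c)\right) $$

Since the code prior $P(c)$ is fixed, the entropy term $H(c)$ is a constant, so in practice only the expected log-likelihood term is optimized.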
In the practical implementation, $Q$ adds only one extra fully-connected layer to output the parameters of the conditional distribution $Q(c\mid x)$, so the computational cost on top of a regular GAN is negligible. For a categorical latent code, $Q$ is represented with a softmax non-linearity; for a continuous latent code, the authors assume a factored Gaussian.
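A minimal numpy sketch of the categorical case (the layer sizes and variable names here are hypothetical, and a real implementation would train `W` and `b` jointly with the GAN rather than leave them random): the $Q$ head is a single fully-connected layer whose softmax output parameterizes $Q(c\mid x)$, and the Monte Carlo average of $\log Q(c\mid x)$ over sampled codes is the data-dependent part of $L_I$.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical Q head: one fully-connected layer on top of
# features shared with the discriminator (sizes are assumptions).
feat_dim, n_cat = 16, 10
W = rng.normal(scale=0.1, size=(feat_dim, n_cat))
b = np.zeros(n_cat)

def q_categorical(features):
    """Parameters of Q(c|x) for a categorical code: softmax over logits."""
    return softmax(features @ W + b)

def mi_term(features, codes):
    """Monte Carlo estimate of E[log Q(c|x)]; adding H(c) gives L_I."""
    probs = q_categorical(features)
    return np.log(probs[np.arange(len(codes)), codes] + 1e-12).mean()

# Usage: fake features for 8 generated samples and their sampled codes.
feats = rng.normal(size=(8, feat_dim))
codes = rng.integers(0, n_cat, size=8)
est = mi_term(feats, codes)
```

For a continuous code, the same head would instead output a mean and log-variance per code dimension, and $\log Q(c\mid x)$ would be the corresponding factored Gaussian log-density.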
Source: InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets

| Task | Papers | Share |
|---|---|---|
| Image Generation | 2 | 15.38% |
| Active Learning | 1 | 7.69% |
| Decision Making | 1 | 7.69% |
| Speech Synthesis | 1 | 7.69% |
| Visual Tracking | 1 | 7.69% |
| Human Motion Prediction | 1 | 7.69% |
| Self-Driving Cars | 1 | 7.69% |
| Trajectory Forecasting | 1 | 7.69% |
| Trajectory Prediction | 1 | 7.69% |
Component types: Feedforward Networks, Activation Functions, Output Functions.