Generative Models

Beta-VAE is a type of variational autoencoder that seeks to discover disentangled latent factors. It modifies the VAE objective with an adjustable hyperparameter $\beta$ that balances latent channel capacity and independence constraints against reconstruction accuracy. The idea is to maximize the probability of generating the real data while keeping the KL divergence between the approximate posterior and the prior below a threshold $\epsilon$.
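
In the notation of the beta-VAE paper, this is the constrained problem

$$ \max_{\theta, \phi} \mathbb{E}_{\mathbf{x} \sim \mathbf{D}}\left[\mathbb{E}_{q_{\phi}\left(\mathbf{z}\mid\mathbf{x}\right)}\left[\log{p}_{\theta}\left(\mathbf{x}\mid\mathbf{z}\right)\right]\right] \quad \text{subject to} \quad D_{KL}\left(q_{\phi}\left(\mathbf{z}\mid\mathbf{x}\right)\,\|\,p\left(\mathbf{z}\right)\right) < \epsilon$$

where $\mathbf{D}$ is the training data. Using the Karush-Kuhn-Tucker (KKT) conditions, this can be rewritten as a single equation: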

$$ \mathcal{F}\left(\theta, \phi, \beta; \mathbf{x}, \mathbf{z}\right) = \mathbb{E}_{q_{\phi}\left(\mathbf{z}\mid\mathbf{x}\right)}\left[\log{p}_{\theta}\left(\mathbf{x}\mid\mathbf{z}\right)\right] - \beta\left(D_{KL}\left(q_{\phi}\left(\mathbf{z}\mid\mathbf{x}\right)\,\|\,p\left(\mathbf{z}\right)\right) - \epsilon\right)$$

where the KKT multiplier $\beta$ is the regularization coefficient that constrains the capacity of the latent channel $\mathbf{z}$ and puts implicit independence pressure on the learnt posterior due to the isotropic nature of the Gaussian prior $p\left(\mathbf{z}\right)$.

Since $\beta, \epsilon \geq 0$, the complementary slackness KKT condition lets us rewrite this as the Beta-VAE formulation, a lower bound on $\mathcal{F}$:

$$ \mathcal{F}\left(\theta, \phi, \beta; \mathbf{x}, \mathbf{z}\right) \geq \mathcal{L}\left(\theta, \phi, \beta; \mathbf{x}, \mathbf{z}\right) = \mathbb{E}_{q_{\phi}\left(\mathbf{z}\mid\mathbf{x}\right)}\left[\log{p}_{\theta}\left(\mathbf{x}\mid\mathbf{z}\right)\right] - \beta{D}_{KL}\left(q_{\phi}\left(\mathbf{z}\mid\mathbf{x}\right)\,\|\,p\left(\mathbf{z}\right)\right)$$

Source: beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework
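
As a concrete illustration, below is a minimal PyTorch sketch of the objective $\mathcal{L}$ above. It assumes an MLP encoder/decoder with a diagonal-Gaussian posterior, a unit-Gaussian prior, and a Bernoulli decoder likelihood; the architecture, dimensions, and $\beta$ value are illustrative choices, not the paper's exact setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BetaVAE(nn.Module):
    """Minimal beta-VAE sketch: MLP encoder/decoder, diagonal-Gaussian posterior."""
    def __init__(self, x_dim=784, z_dim=10, h_dim=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)
        self.logvar = nn.Linear(h_dim, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim), nn.Sigmoid())

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    # Reconstruction term -E_q[log p(x|z)] for a Bernoulli decoder (summed over batch).
    recon = F.binary_cross_entropy(x_recon, x, reduction="sum")
    # Closed-form KL(q_phi(z|x) || N(0, I)) for a diagonal-Gaussian posterior.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    # beta = 1 recovers the standard VAE ELBO; beta > 1 tightens the
    # latent capacity constraint, encouraging disentanglement.
    return recon + beta * kl

# Usage sketch on random data:
model = BetaVAE()
x = torch.rand(32, 784)
x_recon, mu, logvar = model(x)
loss = beta_vae_loss(x, x_recon, mu, logvar, beta=4.0)
loss.backward()
```

Note the design choice in the loss: with `reduction="sum"` both terms scale together with batch size, so `beta` retains the same meaning across batch sizes; averaging one term but not the other would silently rescale the effective $\beta$.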
