The VampPrior Mixture Model

6 Feb 2024  ·  Andrew Stirn, David A. Knowles

Current clustering priors for deep latent variable models (DLVMs) require defining the number of clusters a priori and are susceptible to poor initializations. Addressing these deficiencies would particularly benefit deep learning-based scRNA-seq analysis, where integration and clustering could then be performed simultaneously. We adapt the VampPrior (Tomczak & Welling, 2018) into a Dirichlet process Gaussian mixture model, resulting in the VampPrior Mixture Model (VMM), a novel prior for DLVMs. We propose an inference procedure that alternates between variational inference and Empirical Bayes to cleanly distinguish variational and prior parameters. Using the VMM in a Variational Autoencoder attains highly competitive clustering performance on benchmark datasets. Augmenting scVI (Lopez et al., 2018), a popular scRNA-seq integration method, with the VMM significantly improves its performance and automatically arranges cells into biologically meaningful clusters.
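The core idea of the VampPrior is to define the latent prior as a mixture of the encoder's variational posteriors evaluated at learnable pseudo-inputs; the VMM additionally places mixture weights over these components. The sketch below illustrates only that density computation, using a toy linear "encoder" and fixed uniform weights in place of the learned Dirichlet-process mixture proportions; all names (`vampprior_log_density`, the encoder callable) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def vampprior_log_density(z, pseudo_inputs, encoder, weights):
    """log p(z) under a mixture of diagonal Gaussians whose parameters
    come from passing learnable pseudo-inputs through the encoder
    (the VampPrior idea). `weights` stands in for the mixture
    proportions a Dirichlet process prior would place on components."""
    mus, log_vars = encoder(pseudo_inputs)  # each of shape (K, D)
    var = np.exp(log_vars)
    # log N(z; mu_k, diag(var_k)) for every component k
    log_comp = -0.5 * np.sum(
        np.log(2 * np.pi * var) + (z[None, :] - mus) ** 2 / var, axis=1
    )
    # numerically stable log-sum-exp over weighted components
    a = np.log(weights) + log_comp
    m = a.max()
    return m + np.log(np.sum(np.exp(a - m)))

# Toy usage with a hypothetical linear encoder and K = 5 pseudo-inputs.
rng = np.random.default_rng(0)
K, X_dim, D = 5, 8, 2
W_mu = rng.normal(size=(X_dim, D))
W_lv = rng.normal(scale=0.1, size=(X_dim, D))
encoder = lambda u: (u @ W_mu, u @ W_lv)
pseudo_inputs = rng.normal(size=(K, X_dim))
weights = np.full(K, 1.0 / K)
lp = vampprior_log_density(np.zeros(D), pseudo_inputs, encoder, weights)
```

In training, the pseudo-inputs (and mixture weights) would be optimized alongside the encoder; here they are random only to demonstrate the density evaluation.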


Results from the Paper


Task                              | Dataset       | Model | Metric   | Value | Global Rank
----------------------------------|---------------|-------|----------|-------|------------
Image Clustering                  | Fashion-MNIST | VMM   | Accuracy | 0.716 | # 2
Image Clustering                  | Fashion-MNIST | VMM   | NMI      | 0.710 | # 3
Unsupervised Image Classification | MNIST         | VMM   | Accuracy | 96.74 | # 4
Image Clustering                  | MNIST-full    | VMM   | NMI      | 0.920 | # 12
Image Clustering                  | MNIST-full    | VMM   | Accuracy | 0.967 | # 11
