Marginal Likelihood Gradient for Bayesian Neural Networks
Bayesian learning of neural networks is attractive because it can protect against over-fitting and provides automatic methods for inferring important hyperparameters by maximizing the marginal probability of the data. However, existing approaches in this vein, such as those based on variational inference, do not perform well. In this paper, we take a different approach and directly derive a practical estimator of the gradient of the marginal log-likelihood for BNNs by combining local reparametrization of the network w.r.t.~the prior distribution with the self-normalized importance sampling estimator. We show promising preliminary results on a toy example and on vectorized MNIST classification, where the new method tunes hyperparameters for variational inference significantly better than existing approaches.
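To make the estimator concrete, the following is a minimal sketch of the idea on a one-dimensional toy model rather than a full BNN. All names and the toy likelihood are illustrative assumptions, not the paper's code. Reparametrizing the weight w.r.t. the prior, w = mu + exp(log_sigma) * eps with eps ~ N(0, 1), the gradient of the marginal log-likelihood, grad log E_eps[p(D|w)] = E[p(D|w) grad log p(D|w)] / E[p(D|w)], is approximated by self-normalized importance sampling with weights proportional to p(D|w_i):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy setting (an assumption, not the paper's model):
# prior over one weight w ~ N(mu, exp(log_sigma)^2),
# likelihood p(D|w) = N(y_obs; w, 1) for a single observation.
y_obs = 1.5

def log_lik(w):
    return -0.5 * (y_obs - w) ** 2 - 0.5 * np.log(2 * np.pi)

def snis_grad_log_marginal(mu, log_sigma, eps):
    """SNIS estimate of the gradient of log p(D | mu, log_sigma).

    Uses the reparametrization w = mu + exp(log_sigma) * eps, so the
    gradient of log E_eps[p(D|w)] becomes a weighted average of
    per-sample gradients with self-normalized weights
    proportional to p(D|w_i).
    """
    sigma = np.exp(log_sigma)
    w = mu + sigma * eps
    ll = log_lik(w)
    weights = np.exp(ll - ll.max())
    weights /= weights.sum()                     # self-normalization
    dll_dw = y_obs - w                           # d log p(D|w) / dw
    grad_mu = np.sum(weights * dll_dw)           # dw/dmu = 1
    grad_log_sigma = np.sum(weights * dll_dw * sigma * eps)
    return grad_mu, grad_log_sigma

eps = rng.standard_normal(100_000)
g_mu, g_ls = snis_grad_log_marginal(0.0, 0.0, eps)

# With eps held fixed, the SNIS gradient is exactly the gradient of the
# log of the Monte Carlo marginal-likelihood estimate, so a
# finite-difference check with common random numbers should agree.
def log_marg(mu, log_sigma):
    return np.log(np.mean(np.exp(log_lik(mu + np.exp(log_sigma) * eps))))

h = 1e-5
fd_mu = (log_marg(h, 0.0) - log_marg(-h, 0.0)) / (2 * h)
print(g_mu, fd_mu)  # the two estimates should agree closely
```

In this toy model the marginal likelihood is available in closed form, N(y_obs; mu, sigma^2 + 1), so the gradient w.r.t. mu at (mu=0, log_sigma=0) should be near (y_obs - mu)/(sigma^2 + 1) = 0.75, which the sampled estimate approaches as the number of samples grows.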