Learning in models with discrete latent variables is challenging due to high-variance gradient estimators.
Highly expressive directed latent variable models, such as sigmoid belief networks, are difficult to train on large datasets because exact inference in them is intractable and none of the approximate inference methods that have been applied to them scale well.
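To see why these gradients are high-variance, consider the score-function (REINFORCE) estimator, the generic fallback when the latent is discrete and the reparameterisation trick does not apply. A minimal sketch with illustrative names and a toy objective, not any particular paper's code:

```python
# Score-function (REINFORCE) estimator for a discrete latent z ~ q(z);
# unbiased, but its variance scales with the magnitude of f(z).
import torch

logits = torch.zeros(10, requires_grad=True)   # parameters of a categorical q(z)
probs = torch.softmax(logits, dim=0)

z = torch.multinomial(probs, 1).item()         # sample a discrete latent
f = (z - 4.5) ** 2                             # toy downstream objective f(z)

# grad E_q[f(z)] is estimated by f(z) * grad log q(z)
(f * torch.log(probs[z])).backward()           # gradient accumulates in logits.grad
```

Control variates and continuous relaxations exist precisely to tame the variance of this estimator.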
When used as a surrogate objective for maximum likelihood estimation in latent variable models, the evidence lower bound (ELBO) produces state-of-the-art results.
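A minimal sketch of the ELBO for a Gaussian-latent, Bernoulli-observation model in PyTorch; `encoder` and `decoder` are hypothetical callables, not any specific paper's API:

```python
import torch
import torch.nn.functional as F

def elbo(x, encoder, decoder):
    mu, logvar = encoder(x)                                  # q(z|x) = N(mu, diag(exp(logvar)))
    z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterised sample
    log_px_z = -F.binary_cross_entropy(decoder(z), x, reduction="sum")   # log p(x|z)
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())         # KL(q(z|x) || N(0, I))
    return log_px_z - kl                                     # lower bound on log p(x)
```

Maximising this bound with respect to both encoder and decoder parameters is the surrogate for maximum likelihood estimation.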
Previous work on neural noisy channel modeling relied on latent variable models that incrementally process the source and target sentence.
In this paper, by comparing several density estimators on five machine translation tasks, we find that the correlation between rankings of models based on log-likelihood and BLEU varies significantly depending on the range of the model families being compared.
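The comparison of rankings can be made concrete with a rank correlation. A hypothetical sketch; the scores below are fabricated for illustration and are not results from the paper:

```python
# Spearman rank correlation between two ways of ranking the same five models.
from scipy.stats import spearmanr

log_likelihood = [-1.9, -2.1, -2.4, -2.0, -2.6]   # per-model held-out scores (made up)
bleu           = [27.3, 26.8, 24.1, 25.5, 23.0]   # per-model BLEU (made up)

rho, pvalue = spearmanr(log_likelihood, bleu)
print(f"Spearman rho = {rho:.2f}")
```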
Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network.
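A minimal sketch of the idea, with a fixed-step Euler integrator standing in for the adaptive solvers the approach actually uses; the module and function names are illustrative:

```python
import torch
import torch.nn as nn

class ODEFunc(nn.Module):
    """Neural network that parameterises the derivative dh/dt."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(), nn.Linear(64, dim))

    def forward(self, t, h):
        return self.net(h)

def odeint_euler(func, h0, t0=0.0, t1=1.0, steps=20):
    """Integrate dh/dt = func(t, h) from t0 to t1; 'depth' becomes continuous."""
    h, t = h0, t0
    dt = (t1 - t0) / steps
    for _ in range(steps):
        h = h + dt * func(t, h)   # Euler step; real solvers adapt the step size
        t += dt
    return h
```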
We introduce stochastic variational inference for Gaussian process models.
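One way to realise this in code is GPyTorch's sparse variational GP, which follows the same inducing-point construction; the sketch below mirrors the library's standard usage rather than the paper's own implementation:

```python
import torch
import gpytorch

class SVGPModel(gpytorch.models.ApproximateGP):
    def __init__(self, inducing_points):
        var_dist = gpytorch.variational.CholeskyVariationalDistribution(inducing_points.size(0))
        strategy = gpytorch.variational.VariationalStrategy(
            self, inducing_points, var_dist, learn_inducing_locations=True)
        super().__init__(strategy)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x))

model = SVGPModel(inducing_points=torch.randn(50, 1))
likelihood = gpytorch.likelihoods.GaussianLikelihood()
# num_data rescales minibatch ELBO terms to the full dataset size,
# which is what lets GP inference run on large datasets.
mll = gpytorch.mlls.VariationalELBO(likelihood, model, num_data=10_000)
```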
A neural network (NN) is a parameterised function that can be tuned via gradient descent to approximate a labelled collection of data with high precision.
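A toy sketch of that definition: a small network fit to labelled data by gradient descent, with illustrative values throughout:

```python
import torch
import torch.nn as nn

x = torch.linspace(-1, 1, 100).unsqueeze(1)   # inputs
y = torch.sin(3 * x)                          # labels the network should approximate

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.SGD(net.parameters(), lr=0.1)

for _ in range(500):
    loss = ((net(x) - y) ** 2).mean()         # how far the function is from the labels
    opt.zero_grad()
    loss.backward()                           # gradients of the loss w.r.t. parameters
    opt.step()                                # one gradient-descent update
```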
We validate this framework on two very different text modelling applications: generative document modelling and supervised question answering.
This adversarially regularized autoencoder (ARAE) allows us to generate natural textual outputs as well as to perform manipulations in the latent space that induce changes in the output space.
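A hypothetical sketch of one such latent-space manipulation, interpolating between the codes of two inputs and decoding the blend; `encoder` and `generator` are assumed components of a trained ARAE-style model, not a published API:

```python
import torch

def interpolate(encoder, generator, x1, x2, steps=5):
    z1, z2 = encoder(x1), encoder(x2)
    # Decode points along the line between the two latent codes.
    return [generator((1 - a) * z1 + a * z2) for a in torch.linspace(0, 1, steps)]
```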