At the same time, advances in approximate Bayesian methods have made posterior approximation for flexible neural network models practical.
SOTA for Multi-Armed Bandits on Mushroom
We show that this form of adversarial training converges to a degenerate global minimum, wherein small curvature artifacts near the data points obfuscate a linear approximation of the loss.
In this work, we study how depthwise separable convolutions can be applied to neural machine translation.
#14 best model for Machine Translation on WMT2014 English-German
We propose to improve the representation in sequence models by augmenting current approaches with an autoencoder that is forced to compress the sequence through an intermediate discrete latent space.
We finally describe experiments on the English-Esperanto low-resource language pair, on which there only exists a limited amount of parallel data, to show the potential impact of our method in fully unsupervised machine translation.
SOTA for Word Alignment on en-es