With the capability of modeling bidirectional contexts, denoising-autoencoding-based pretraining such as BERT achieves better performance than pretraining approaches based on autoregressive language modeling.
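The denoising objective referred to here corrupts part of the input and asks the model to reconstruct it. A minimal sketch of the corruption step (the `mask_tokens` helper, mask id, and probability are illustrative, not from any particular library):

```python
import random

def mask_tokens(tokens, mask_id=0, mask_prob=0.15, seed=42):
    """Corrupt a token sequence by replacing a random subset with a mask id,
    as in denoising (masked-LM) pretraining. Returns the corrupted sequence
    and the positions the model must reconstruct from bidirectional context."""
    rng = random.Random(seed)
    corrupted = list(tokens)
    targets = []
    for i in range(len(tokens)):
        if rng.random() < mask_prob:
            corrupted[i] = mask_id
            targets.append(i)
    return corrupted, targets

corrupted, targets = mask_tokens([5, 9, 12, 7, 3, 8, 11, 4])
```

Because masked positions are predicted from both left and right context, the objective is bidirectional, unlike a left-to-right autoregressive language model.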
We explore the use of Vector Quantized Variational AutoEncoder (VQ-VAE) models for large scale image generation.
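The core of VQ-VAE is discretizing a continuous latent vector by snapping it to the nearest entry of a learned codebook. A minimal sketch of that quantization step (the codebook values here are made up for illustration):

```python
def quantize(vec, codebook):
    """Map a continuous latent vector to its nearest codebook entry under
    squared Euclidean distance -- the discretization step at the heart of VQ-VAE."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    idx = min(range(len(codebook)), key=lambda k: sqdist(vec, codebook[k]))
    return idx, codebook[idx]

codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
idx, code = quantize((0.9, 0.1), codebook)  # nearest entry is index 1
```

In the full model the encoder output is quantized this way per spatial position, and a prior is then learned over the resulting discrete codes to generate images.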
We further confirm the flexibility of our model by showing that a Levenshtein Transformer trained for machine translation can be applied straightforwardly to automatic post-editing.
We demonstrate that scaling networks with CondConv improves the performance and inference cost trade-off of several existing convolutional neural network architectures on both classification and detection tasks.
The vast majority of successful deep neural networks are trained using variants of stochastic gradient descent (SGD) algorithms.
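The basic SGD variant mentioned here updates each parameter by stepping against its gradient. A minimal sketch (the quadratic objective and learning rate are illustrative):

```python
def sgd_step(params, grads, lr=0.1):
    """One plain SGD update: move each parameter opposite its gradient."""
    return [p - lr * g for p, g in zip(params, grads)]

# Minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w = [0.0]
for _ in range(100):
    w = sgd_step(w, [2 * (w[0] - 3)])
# w[0] converges toward the minimizer 3
```

The variants the line alludes to (momentum, Adam, etc.) modify how the gradient is accumulated or scaled, but keep this same update skeleton.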
Scaling up deep neural network capacity is known to be an effective approach to improving model quality across several machine learning tasks.
On unsupervised machine translation, we obtain 34.3 BLEU on WMT'16 German-English, improving the previous state of the art by more than 9 BLEU.
To address these limitations, we propose a few-shot vid2vid framework, which learns to synthesize videos of previously unseen subjects or scenes by leveraging a few example images of the target at test time.