Search Results for author: Aäron van den Oord

Found 15 papers, 7 papers with code

AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning

1 code implementation • 7 Aug 2023 • Michaël Mathieu, Sherjil Ozair, Srivatsan Srinivasan, Caglar Gulcehre, Shangtong Zhang, Ray Jiang, Tom Le Paine, Richard Powell, Konrad Żołna, Julian Schrittwieser, David Choi, Petko Georgiev, Daniel Toyama, Aja Huang, Roman Ring, Igor Babuschkin, Timo Ewalds, Mahyar Bordbar, Sarah Henderson, Sergio Gómez Colmenarejo, Aäron van den Oord, Wojciech Marian Czarnecki, Nando de Freitas, Oriol Vinyals

StarCraft II is one of the most challenging simulated reinforcement learning environments: it is partially observable, stochastic, and multi-agent, and mastering it requires strategic planning over long time horizons with real-time low-level execution.

Offline RL reinforcement-learning +2

Vector Quantized Models for Planning

no code implementations • 8 Jun 2021 • Sherjil Ozair, Yazhe Li, Ali Razavi, Ioannis Antonoglou, Aäron van den Oord, Oriol Vinyals

Our key insight is to use discrete autoencoders to capture the multiple possible effects of an action in a stochastic environment.

Broaden Your Views for Self-Supervised Video Learning

1 code implementation • ICCV 2021 • Adrià Recasens, Pauline Luc, Jean-Baptiste Alayrac, Luyu Wang, Ross Hemsley, Florian Strub, Corentin Tallec, Mateusz Malinowski, Viorica Patraucean, Florent Altché, Michal Valko, Jean-Bastien Grill, Aäron van den Oord, Andrew Zisserman

Most successful self-supervised learning methods are trained to align the representations of two independent views from the data.

Ranked #1 on Self-Supervised Action Recognition on HMDB51 (finetuned)

Audio Classification Optical Flow Estimation +4

Predicting Video with VQVAE

1 code implementation • 2 Mar 2021 • Jacob Walker, Ali Razavi, Aäron van den Oord

In recent years, the task of video prediction (forecasting future video given past video frames) has attracted attention in the research community.

Ranked #10 on Video Prediction on Kinetics-600 12 frames, 64x64

Video Generation Video Prediction

Are we done with ImageNet?

2 code implementations • 12 Jun 2020 • Lucas Beyer, Olivier J. Hénaff, Alexander Kolesnikov, Xiaohua Zhai, Aäron van den Oord

Yes, and no.

Low Bit-Rate Speech Coding with VQ-VAE and a WaveNet Decoder

no code implementations • 14 Oct 2019 • Cristina Gârbacea, Aäron van den Oord, Yazhe Li, Felicia S. C. Lim, Alejandro Luebs, Oriol Vinyals, Thomas C. Walters

In order to efficiently transmit and store speech signals, speech codecs create a minimally redundant representation of the input signal which is then decoded at the receiver with the best possible perceptual quality.

Visual Imitation with a Minimal Adversary

no code implementations • ICLR 2019 • Scott Reed, Yusuf Aytar, Ziyu Wang, Tom Paine, Aäron van den Oord, Tobias Pfaff, Sergio Gomez, Alexander Novikov, David Budden, Oriol Vinyals

The proposed agent can solve a challenging robot manipulation task of block stacking from only video demonstrations and sparse reward, in which the non-imitating agents fail to learn completely.

Imitation Learning Robot Manipulation

Unsupervised speech representation learning using WaveNet autoencoders

5 code implementations • 25 Jan 2019 • Jan Chorowski, Ron J. Weiss, Samy Bengio, Aäron van den Oord

We consider the task of unsupervised extraction of meaningful latent representations of speech by applying autoencoding neural networks to speech waveforms.

Acoustic Unit Discovery Dimensionality Reduction +1

Preventing Posterior Collapse with delta-VAEs

no code implementations • ICLR 2019 • Ali Razavi, Aäron van den Oord, Ben Poole, Oriol Vinyals

Due to the phenomenon of "posterior collapse," current latent variable generative models pose a challenging design choice that either weakens the capacity of the decoder or requires augmenting the objective so it does not only maximize the likelihood of the data.

Ranked #7 on Image Generation on ImageNet 32x32 (bpd metric)

Image Generation Representation Learning

Sample Efficient Adaptive Text-to-Speech

no code implementations • ICLR 2019 • Yutian Chen, Yannis Assael, Brendan Shillingford, David Budden, Scott Reed, Heiga Zen, Quan Wang, Luis C. Cobo, Andrew Trask, Ben Laurie, Caglar Gulcehre, Aäron van den Oord, Oriol Vinyals, Nando de Freitas

Instead, the aim is to produce a network that requires little data at deployment time to rapidly adapt to new speakers.

Meta-Learning Voice Similarity

The challenge of realistic music generation: modelling raw audio at scale

no code implementations • NeurIPS 2018 • Sander Dieleman, Aäron van den Oord, Karen Simonyan

It has been shown that autoregressive models excel at generating raw audio waveforms of speech, but when applied to music, we find them biased towards capturing local signal structure at the expense of modelling long-range correlations.

Music Generation

Few-shot Autoregressive Density Estimation: Towards Learning to Learn Distributions

no code implementations • ICLR 2018 • Scott Reed, Yutian Chen, Thomas Paine, Aäron van den Oord, S. M. Ali Eslami, Danilo Rezende, Oriol Vinyals, Nando de Freitas

Deep autoregressive models have shown state-of-the-art performance in density estimation for natural images on large-scale datasets such as ImageNet.

Density Estimation Image Generation +1

Parallel Multiscale Autoregressive Density Estimation

no code implementations • ICML 2017 • Scott Reed, Aäron van den Oord, Nal Kalchbrenner, Sergio Gómez Colmenarejo, Ziyu Wang, Dan Belov, Nando de Freitas

Our new PixelCNN model achieves competitive density estimation and orders-of-magnitude speedup, with O(log N) sampling instead of O(N), enabling the practical generation of 512x512 images.

Ranked #2 on Image Compression on ImageNet32

Conditional Image Generation Density Estimation +2

A note on the evaluation of generative models

1 code implementation • 5 Nov 2015 • Lucas Theis, Aäron van den Oord, Matthias Bethge

In particular, we show that three of the currently most commonly used criteria (average log-likelihood, Parzen window estimates, and visual fidelity of samples) are largely independent of each other when the data is high-dimensional.

Denoising Texture Synthesis

Beyond Temporal Pooling: Recurrence and Temporal Convolutions for Gesture Recognition in Video

1 code implementation • 5 Jun 2015 • Lionel Pigou, Aäron van den Oord, Sander Dieleman, Mieke Van Herreweghe, Joni Dambre

Recent studies have demonstrated the power of recurrent neural networks for machine translation, image captioning and speech recognition.

Ranked #1 on Gesture Recognition on Montalbano

Gesture Recognition Image Captioning +5
