12 code implementations • ICLR 2021 • Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamas Sarlos, Peter Hawkins, Jared Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy Colwell, Adrian Weller
We introduce Performers, Transformer architectures which can estimate regular (softmax) full-rank-attention Transformers with provable accuracy, but using only linear (as opposed to quadratic) space and time complexity, without relying on any priors such as sparsity or low-rankness.
Ranked #7 on Offline RL on D4RL
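The core idea of the Performer entry above — approximating softmax attention with positive random features so that attention can be computed in time linear in sequence length — can be sketched as follows. This is a minimal illustration of the kernel-feature trick, not the paper's FAVOR+ implementation; the feature count, scaling, and function names are choices made here for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

def positive_random_features(X, W):
    # phi(x) = exp(W x - ||x||^2 / 2) / sqrt(m): positive features whose
    # inner products approximate the softmax kernel exp(q . k) in expectation
    m = W.shape[0]
    return np.exp(X @ W.T - 0.5 * np.sum(X**2, axis=-1, keepdims=True)) / np.sqrt(m)

def linear_attention(Q, K, V, n_features=256):
    # Approximate softmax attention in O(L * m * d) instead of O(L^2 * d)
    d = Q.shape[-1]
    W = rng.standard_normal((n_features, d))
    # Scale queries/keys so phi(q) . phi(k) ~ exp(q . k / sqrt(d))
    Qp = positive_random_features(Q / d**0.25, W)
    Kp = positive_random_features(K / d**0.25, W)
    # Reassociate (Qp Kp^T) V as Qp (Kp^T V): never forms the L x L matrix
    KV = Kp.T @ V                    # (m, d_v)
    Z = Qp @ Kp.sum(axis=0)          # per-query normalizer, shape (L,)
    return (Qp @ KV) / Z[:, None]
```

With enough random features, the output closely matches exact softmax attention, while the memory and time cost grow linearly with sequence length.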
1 code implementation • 5 Jun 2020 • Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamas Sarlos, Peter Hawkins, Jared Davis, David Belanger, Lucy Colwell, Adrian Weller
In response, solutions that exploit the structure and sparsity of the learned attention matrix have blossomed.
no code implementations • ICML 2020 • Christof Angermueller, David Belanger, Andreea Gane, Zelda Mariet, David Dohan, Kevin Murphy, Lucy Colwell, D. Sculley
The cost and latency of wet-lab experiments require methods that find good sequences in a few experimental rounds of large batches of sequences, a setting that off-the-shelf black-box optimization methods are ill-equipped to handle.
1 code implementation • NeurIPS 2019 • Guy Lorberbom, Tommi Jaakkola, Andreea Gane, Tamir Hazan
Reparameterization of variational auto-encoders with continuous random variables is an effective method for reducing the variance of their gradient estimates.
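The reparameterization trick referred to here can be sketched in a few lines (a minimal Gaussian-latent illustration, not the paper's code): the sample is written as a deterministic function of the distribution's parameters plus parameter-free noise, so gradients flow through the sample.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterized_sample(mu, log_sigma, eps=None):
    # z = mu + sigma * eps with eps ~ N(0, 1): z is a differentiable
    # function of (mu, log_sigma), unlike a direct call to a sampler
    if eps is None:
        eps = rng.standard_normal(np.shape(mu))
    return mu + np.exp(log_sigma) * eps

# Pathwise gradient of E[z^2] w.r.t. mu: d(z^2)/dmu = 2z, so averaging
# 2z over samples estimates d/dmu E[z^2] = 2*mu
mu, log_sigma = 0.5, 0.0
eps = rng.standard_normal(100_000)
z = reparameterized_sample(mu, log_sigma, eps)
grad_mu = np.mean(2 * z)
```

Because the noise is sampled independently of the parameters, this pathwise estimator typically has much lower variance than score-function (REINFORCE-style) alternatives, which is the property the paper builds on.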
1 code implementation • 24 Jul 2018 • Luke B. Hewitt, Maxwell I. Nye, Andreea Gane, Tommi Jaakkola, Joshua B. Tenenbaum
However, when this generative model is expressed as a powerful neural network such as a PixelCNN, we show that existing learning techniques typically fail to effectively use latent variables.
2 code implementations • ICLR 2019 • Guy Lorberbom, Andreea Gane, Tommi Jaakkola, Tamir Hazan
We demonstrate empirically the effectiveness of the direct loss minimization technique in variational autoencoders with both unstructured and structured discrete latent variables.
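Discrete latent variables in this line of work are typically drawn with the Gumbel-max trick, the sampling step that direct-optimization methods then differentiate through. A minimal sketch of that step (an illustration under standard assumptions, not the paper's estimator):

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_max_sample(logits, n=1):
    # argmax(logits + Gumbel noise) is an exact sample from softmax(logits);
    # direct loss minimization perturbs and differentiates through this argmax
    g = rng.gumbel(size=(n,) + logits.shape)
    return np.argmax(logits + g, axis=-1)

# Sampling frequencies recover the underlying categorical distribution
logits = np.log(np.array([0.2, 0.3, 0.5]))
samples = gumbel_max_sample(logits, n=100_000)
freqs = np.bincount(samples, minlength=3) / samples.size
```

The argmax itself has zero gradient almost everywhere, which is why techniques such as direct loss minimization compare argmax solutions of perturbed objectives rather than differentiating the sampler naively.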
1 code implementation • 21 Nov 2015 • Jesse Dodge, Andreea Gane, Xiang Zhang, Antoine Bordes, Sumit Chopra, Alexander Miller, Arthur Szlam, Jason Weston
A long-term goal of machine learning is to build intelligent conversational agents.