Search Results for author: Yee Whye Teh

Found 139 papers, 65 papers with code

Simple and Scalable Epistemic Uncertainty Estimation Using a Single Deep Deterministic Neural Network

no code implementations ICML 2020 Joost van Amersfoort, Lewis Smith, Yee Whye Teh, Yarin Gal

We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass.
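As a rough illustration of the single-forward-pass idea (not the paper's exact architecture or training objective), inputs can be scored by an RBF kernel distance between deterministic features and per-class centroids, with low maximum score treated as out of distribution; the dimensions, length scale, and threshold below are placeholders:

```python
# Hedged sketch: classify by nearest class centroid in feature space and reject
# inputs whose best kernel score is low. Feature extractor, centroids, and the
# threshold are illustrative stand-ins, not the paper's trained components.
import torch

def duq_style_predict(features, centroids, length_scale=0.5, threshold=0.5):
    """features: (B, D) deterministic embeddings; centroids: (C, D) class centres."""
    sq_dist = torch.cdist(features, centroids) ** 2          # (B, C) squared distances
    scores = torch.exp(-sq_dist / (2 * length_scale ** 2))   # RBF "correctness" score per class
    confidence, predicted_class = scores.max(dim=1)
    is_ood = confidence < threshold                           # reject low-confidence inputs
    return predicted_class, confidence, is_ood

features, centroids = torch.randn(4, 16), torch.randn(3, 16)
print(duq_style_predict(features, centroids))
```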

Uncertainty Quantification

Unleashing the Power of Meta-tuning for Few-shot Generalization Through Sparse Interpolated Experts

1 code implementation 13 Mar 2024 Shengzhuang Chen, Jihoon Tack, Yunqiao Yang, Yee Whye Teh, Jonathan Richard Schwarz, Ying Wei

Conventional wisdom suggests parameter-efficient fine-tuning of foundation models as the state-of-the-art method for transfer learning in vision, replacing the rich literature of alternatives such as meta-learning.

Domain Generalization Few-Shot Image Classification +2

Online Adaptation of Language Models with a Memory of Amortized Contexts

1 code implementation 7 Mar 2024 Jihoon Tack, Jaehyung Kim, Eric Mitchell, Jinwoo Shin, Yee Whye Teh, Jonathan Richard Schwarz

We propose an amortized feature extraction and memory-augmentation approach to compress and extract information from new documents into compact modulations stored in a memory bank.

Language Modelling Meta-Learning

The Edge-of-Reach Problem in Offline Model-Based Reinforcement Learning

2 code implementations 19 Feb 2024 Anya Sims, Cong Lu, Yee Whye Teh

The prevailing theoretical understanding is that this can then be viewed as online reinforcement learning in an approximate dynamics model, and any remaining gap is therefore assumed to be due to the imperfect dynamics model.

Model-based Reinforcement Learning reinforcement-learning

Tractable Function-Space Variational Inference in Bayesian Neural Networks

1 code implementation 28 Dec 2023 Tim G. J. Rudner, Zonghao Chen, Yee Whye Teh, Yarin Gal

Recognizing that the primary object of interest in most settings is the distribution over functions induced by the posterior distribution over neural network parameters, we frame Bayesian inference in neural networks explicitly as inferring a posterior distribution over functions and propose a scalable function-space variational inference method that allows incorporating prior information and results in reliable predictive uncertainty estimates.

Bayesian Inference Medical Diagnosis +1

Continual Learning via Sequential Function-Space Variational Inference

no code implementations 28 Dec 2023 Tim G. J. Rudner, Freddie Bickford Smith, Qixuan Feng, Yee Whye Teh, Yarin Gal

Sequential Bayesian inference over predictive functions is a natural framework for continual learning from streams of data.

Bayesian Inference Continual Learning +2

SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning

1 code implementation 1 Aug 2023 Ning Miao, Yee Whye Teh, Tom Rainforth

The recent progress in large language models (LLMs), especially the invention of chain-of-thought prompting, has made it possible to automatically answer questions by stepwise reasoning.

GSM8K Math +1

Drug Discovery under Covariate Shift with Domain-Informed Prior Distributions over Functions

1 code implementation 14 Jul 2023 Leo Klarner, Tim G. J. Rudner, Michael Reutlinger, Torsten Schindler, Garrett M. Morris, Charlotte Deane, Yee Whye Teh

Accelerating the discovery of novel and more effective therapeutics is an important pharmaceutical problem in which deep learning is playing an increasingly significant role.

Domain Adaptation Drug Discovery

Kalman Filter for Online Classification of Non-Stationary Data

no code implementations 14 Jun 2023 Michalis K. Titsias, Alexandre Galashov, Amal Rannen-Triki, Razvan Pascanu, Yee Whye Teh, Jorg Bornschein

Non-stationarity over the linear predictor weights is modelled using a parameter drift transition density, parametrized by a coefficient that quantifies forgetting.
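A minimal sketch of this forgetting mechanism, under assumptions: the predictor weights follow a drift transition w_t ~ N(gamma * w_{t-1}, q * I) and are tracked by a Kalman filter; a Gaussian observation model stands in for the classification likelihood used in the paper, and all constants are illustrative.

```python
# Hedged sketch: online Kalman filtering of linear predictor weights with a
# drift coefficient gamma < 1 acting as a forgetting factor.
import numpy as np

rng = np.random.default_rng(0)
d, gamma, q, r = 3, 0.99, 1e-3, 0.1               # drift coefficient, drift and obs. variances
mean, cov = np.zeros(d), np.eye(d)

for t in range(500):
    w_true = np.array([np.sin(t / 50), 1.0, -0.5])    # slowly drifting ground-truth weights
    x = rng.normal(size=d)
    y = x @ w_true + rng.normal(scale=np.sqrt(r))
    # predict step: apply the parameter drift transition density
    mean, cov = gamma * mean, gamma ** 2 * cov + q * np.eye(d)
    # update step: standard Kalman update for the scalar observation y = x @ w + noise
    s = x @ cov @ x + r
    k = cov @ x / s
    mean = mean + k * (y - x @ mean)
    cov = cov - np.outer(k, x @ cov)
print("final weight estimate:", mean.round(2))
```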

Classification Continual Learning +1

Synthetic Experience Replay

1 code implementation NeurIPS 2023 Cong Lu, Philip J. Ball, Yee Whye Teh, Jack Parker-Holder

We believe that synthetic training data could open the door to realizing the full potential of deep learning for replay-based RL algorithms from limited data.

Reinforcement Learning (RL) Self-Supervised Learning

Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation

no code implementations 20 Feb 2023 Bobby He, James Martens, Guodong Zhang, Aleksandar Botev, Andrew Brock, Samuel L Smith, Yee Whye Teh

Skip connections and normalisation layers form two standard architectural components that are ubiquitous for the training of Deep Neural Networks (DNNs), but whose precise roles are poorly understood.

Modality-Agnostic Variational Compression of Implicit Neural Representations

no code implementations 23 Jan 2023 Jonathan Richard Schwarz, Jihoon Tack, Yee Whye Teh, Jaeho Lee, Jinwoo Shin

We introduce a modality-agnostic neural compression algorithm based on a functional view of data and parameterised as an Implicit Neural Representation (INR).

Data Compression

On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations

1 code implementation NeurIPS 2021 Tim G. J. Rudner, Cong Lu, Michael A. Osborne, Yarin Gal, Yee Whye Teh

KL-regularized reinforcement learning from expert demonstrations has proved successful in improving the sample efficiency of deep reinforcement learning algorithms, allowing them to be applied to challenging physical real-world tasks.

reinforcement-learning Reinforcement Learning (RL)

Riemannian Diffusion Schrödinger Bridge

no code implementations 7 Jul 2022 James Thornton, Michael Hutchinson, Emile Mathieu, Valentin De Bortoli, Yee Whye Teh, Arnaud Doucet

Our proposed method generalizes the Diffusion Schrödinger Bridge introduced in De Bortoli et al. (2021) to the non-Euclidean setting and extends Riemannian score-based models beyond the first time reversal.

Density Estimation

When Does Re-initialization Work?

no code implementations 20 Jun 2022 Sheheryar Zaidi, Tudor Berariu, Hyunjik Kim, Jörg Bornschein, Claudia Clopath, Yee Whye Teh, Razvan Pascanu

However, when deployed alongside other carefully tuned regularization techniques, re-initialization methods offer little to no added benefit for generalization, although optimal generalization performance becomes less sensitive to the choice of learning rate and weight decay hyperparameters.

Data Augmentation Image Classification

Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations

2 code implementations 9 Jun 2022 Cong Lu, Philip J. Ball, Tim G. J. Rudner, Jack Parker-Holder, Michael A. Osborne, Yee Whye Teh

Using this suite of benchmarking tasks, we show that simple modifications to two popular vision-based online reinforcement learning algorithms, DreamerV2 and DrQ-v2, suffice to outperform existing offline RL methods and establish competitive baselines for continuous control in the visual domain.

Benchmarking Continuous Control +3

Conformal Off-Policy Prediction in Contextual Bandits

no code implementations 9 Jun 2022 Muhammad Faaiz Taufiq, Jean-Francois Ton, Rob Cornish, Yee Whye Teh, Arnaud Doucet

Most off-policy evaluation methods for contextual bandits have focused on the expected outcome of a policy, which is estimated via methods that at best provide only asymptotic guarantees.

Conformal Prediction Multi-Armed Bandits +1

Pre-training via Denoising for Molecular Property Prediction

1 code implementation 31 May 2022 Sheheryar Zaidi, Michael Schaarschmidt, James Martens, Hyunjik Kim, Yee Whye Teh, Alvaro Sanchez-Gonzalez, Peter Battaglia, Razvan Pascanu, Jonathan Godwin

Many important problems involving molecular property prediction from 3D structures have limited data, posing a generalization challenge for neural networks.

Denoising Molecular Property Prediction +1

Meta-Learning Sparse Compression Networks

no code implementations 18 May 2022 Jonathan Richard Schwarz, Yee Whye Teh

Recent work in Deep Learning has re-imagined the representation of data as functions mapping from a coordinate space to an underlying continuous signal.

Meta-Learning

UncertaINR: Uncertainty Quantification of End-to-End Implicit Neural Representations for Computed Tomography

1 code implementation 22 Feb 2022 Francisca Vasconcelos, Bobby He, Nalini Singh, Yee Whye Teh

To that end, we study a Bayesian reformulation of INRs, UncertaINR, in the context of computed tomography, and evaluate several Bayesian deep learning implementations in terms of accuracy and calibration.

Computed Tomography (CT) Decision Making +2

Riemannian Score-Based Generative Modelling

2 code implementations 6 Feb 2022 Valentin De Bortoli, Emile Mathieu, Michael Hutchinson, James Thornton, Yee Whye Teh, Arnaud Doucet

Score-based generative models (SGMs) are a powerful class of generative models that exhibit remarkable empirical performance.

Denoising

COIN++: Neural Compression Across Modalities

1 code implementation 30 Jan 2022 Emilien Dupont, Hrushikesh Loya, Milad Alizadeh, Adam Goliński, Yee Whye Teh, Arnaud Doucet

Neural compression algorithms are typically based on autoencoders that require specialized encoder and decoder architectures for different data modalities.

Vector-valued Gaussian Processes on Riemannian Manifolds via Gauge Independent Projected Kernels

no code implementations NeurIPS 2021 Michael Hutchinson, Alexander Terenin, Viacheslav Borovitskiy, So Takao, Yee Whye Teh, Marc Peter Deisenroth

Gaussian processes are machine learning models capable of learning unknown functions in a way that represents uncertainty, thereby facilitating construction of optimal decision-making systems.

BIG-bench Machine Learning Decision Making +2

Powerpropagation: A sparsity inducing weight reparameterisation

2 code implementations NeurIPS 2021 Jonathan Schwarz, Siddhant M. Jayakumar, Razvan Pascanu, Peter E. Latham, Yee Whye Teh

The training of sparse neural networks is becoming an increasingly important tool for reducing the computational footprint of models at training and evaluation time, as well as enabling the effective scaling up of models.

On Incorporating Inductive Biases into VAEs

1 code implementation ICLR 2022 Ning Miao, Emile Mathieu, N. Siddharth, Yee Whye Teh, Tom Rainforth

InteL-VAEs use an intermediary set of latent variables to control the stochasticity of the encoding process, before mapping these in turn to the latent representation using a parametric function that encapsulates our desired inductive bias(es).
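A minimal sketch of the pipeline described above, under assumptions: a standard stochastic encoder produces intermediary latents z', which a deterministic parametric map (the carrier of the inductive bias) turns into the latent representation z used by the decoder. The sparsifying ReLU map, layer sizes, and class name below are purely illustrative, not the paper's implementation.

```python
import torch, torch.nn as nn

class IntermediaryVAE(nn.Module):
    def __init__(self, d_in=10, d_latent=4):
        super().__init__()
        self.enc = nn.Linear(d_in, 2 * d_latent)
        self.dec = nn.Linear(d_latent, d_in)

    def bias_map(self, z_raw):
        # one illustrative choice of inductive bias: sparse, non-negative latents
        return torch.relu(z_raw)

    def forward(self, x):
        mu, log_var = self.enc(x).chunk(2, dim=-1)
        z_raw = mu + torch.randn_like(mu) * (0.5 * log_var).exp()   # intermediary latents z'
        z = self.bias_map(z_raw)                                     # latent representation z
        return self.dec(z), mu, log_var

model = IntermediaryVAE()
recon, mu, log_var = model(torch.randn(8, 10))
print(recon.shape)
```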

Inductive Bias

On Contrastive Representations of Stochastic Processes

1 code implementation NeurIPS 2021 Emile Mathieu, Adam Foster, Yee Whye Teh

Learning representations of stochastic processes is an emerging problem in machine learning with applications from meta-learning to physical object models to time series.

Meta-Learning Time Series +1

Group Equivariant Subsampling

1 code implementation NeurIPS 2021 Jin Xu, Hyunjik Kim, Tom Rainforth, Yee Whye Teh

We use these layers to construct group equivariant autoencoders (GAEs) that allow us to learn low-dimensional equivariant representations.

Translation

BayesIMP: Uncertainty Quantification for Causal Data Fusion

no code implementations NeurIPS 2021 Siu Lun Chau, Jean-François Ton, Javier González, Yee Whye Teh, Dino Sejdinovic

While causal models are becoming one of the mainstays of machine learning, the problem of uncertainty quantification in causal inference remains challenging.

Bayesian Optimisation Causal Inference +1

COIN: COmpression with Implicit Neural representations

1 code implementation ICLR 2021 Neural Compression Workshop Emilien Dupont, Adam Goliński, Milad Alizadeh, Yee Whye Teh, Arnaud Doucet

We propose a new simple approach for image compression: instead of storing the RGB values for each pixel of an image, we store the weights of a neural network overfitted to the image.
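A minimal sketch of this idea, under assumptions: overfit a small coordinate MLP mapping (x, y) to (r, g, b) on a single image and store its weights instead of the pixels. The paper's model uses sine activations and weight quantization; the plain ReLU network, sizes, and training loop below are illustrative stand-ins.

```python
import torch, torch.nn as nn

image = torch.rand(32, 32, 3)                     # stand-in image with values in [0, 1]
ys, xs = torch.meshgrid(torch.linspace(-1, 1, 32), torch.linspace(-1, 1, 32), indexing="ij")
coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)
targets = image.reshape(-1, 3)

mlp = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 3))
opt = torch.optim.Adam(mlp.parameters(), lr=1e-3)
for _ in range(200):                              # overfit to this single image
    opt.zero_grad()
    loss = ((mlp(coords) - targets) ** 2).mean()
    loss.backward()
    opt.step()

# The "compressed file" is the (optionally quantized) parameter vector.
n_params = sum(p.numel() for p in mlp.parameters())
print(f"stored {n_params} weights instead of {targets.numel()} pixel values")
```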

Data Compression Image Compression

Generative Models as Distributions of Functions

1 code implementation 9 Feb 2021 Emilien Dupont, Yee Whye Teh, Arnaud Doucet

By treating data points as functions, we can abstract away from the specific type of data we train on and construct models that are agnostic to discretization.

LieTransformer: Equivariant self-attention for Lie Groups

1 code implementation 20 Dec 2020 Michael Hutchinson, Charline Le Lan, Sheheryar Zaidi, Emilien Dupont, Yee Whye Teh, Hyunjik Kim

Group equivariant neural networks are used as building blocks of group invariant neural networks, which have been shown to improve generalisation performance and data efficiency through principled parameter sharing.

regression

Equivariant Learning of Stochastic Fields: Gaussian Processes and Steerable Conditional Neural Processes

1 code implementation 25 Nov 2020 Peter Holderrieth, Michael Hutchinson, Yee Whye Teh

Motivated by objects such as electric fields or fluid streams, we study the problem of learning stochastic fields, i.e. stochastic processes whose samples are fields like those occurring in physics and engineering.

Gaussian Processes Transfer Learning

Amortized Probabilistic Detection of Communities in Graphs

2 code implementations 29 Oct 2020 Yueqi Wang, Yoonho Lee, Pallab Basu, Juho Lee, Yee Whye Teh, Liam Paninski, Ari Pakman

While graph neural networks (GNNs) have been successful in encoding graph structures, existing GNN-based methods for community detection are limited by requiring knowledge of the number of communities in advance, in addition to lacking a proper probabilistic formulation to handle uncertainty.

Clustering Community Detection

Behavior Priors for Efficient Reinforcement Learning

no code implementations 27 Oct 2020 Dhruva Tirumala, Alexandre Galashov, Hyeonwoo Noh, Leonard Hasenclever, Razvan Pascanu, Jonathan Schwarz, Guillaume Desjardins, Wojciech Marian Czarnecki, Arun Ahuja, Yee Whye Teh, Nicolas Heess

In this work we consider how information and architectural constraints can be combined with ideas from the probabilistic modeling literature to learn behavior priors that capture the common movement and interaction patterns that are shared across a set of related tasks or contexts.

Continuous Control Hierarchical Reinforcement Learning +3

Importance Weighted Policy Learning and Adaptation

no code implementations 10 Sep 2020 Alexandre Galashov, Jakub Sygnowski, Guillaume Desjardins, Jan Humplik, Leonard Hasenclever, Rae Jeong, Yee Whye Teh, Nicolas Heess

The ability to exploit prior experience to solve novel problems rapidly is a hallmark of biological learning systems and of great practical importance for artificial ones.

Meta Reinforcement Learning reinforcement-learning +1

Bootstrapping Neural Processes

1 code implementation NeurIPS 2020 Juho Lee, Yoonho Lee, Jungtaek Kim, Eunho Yang, Sung Ju Hwang, Yee Whye Teh

While this "data-driven" way of learning stochastic processes has proven to handle various types of data, NPs still rely on an assumption that uncertainty in stochastic processes is modeled by a single latent variable, which potentially limits the flexibility.

Lottery Tickets in Linear Models: An Analysis of Iterative Magnitude Pruning

no code implementations 16 Jul 2020 Bryn Elesedy, Varun Kanade, Yee Whye Teh

We analyse the pruning procedure behind the lottery ticket hypothesis (arXiv:1803.03635v5), iterative magnitude pruning (IMP), when applied to linear models trained by gradient flow.
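For orientation, a minimal sketch of iterative magnitude pruning on a linear model: train, prune the smallest-magnitude weights, rewind the survivors to their initial values, and repeat. The paper's analysis is for gradient flow; the discrete gradient-descent loop, data, and pruning schedule here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
w_true = rng.normal(size=50) * (rng.random(50) < 0.2)     # sparse ground-truth weights
y = X @ w_true

w_init = rng.normal(size=50) * 0.1
mask = np.ones(50, dtype=bool)
for round_ in range(3):
    w = w_init.copy() * mask                              # rewind surviving weights to init
    for _ in range(500):                                  # full-batch gradient descent
        grad = X.T @ (X @ (w * mask) - y) / len(y)
        w -= 0.01 * grad * mask
    keep = int(mask.sum() * 0.7)                          # prune 30% of remaining weights
    thresh = np.sort(np.abs(w[mask]))[-keep]
    mask &= np.abs(w) >= thresh
    print(f"round {round_}: {mask.sum()} weights remain")
```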

Bayesian Deep Ensembles via the Neural Tangent Kernel

3 code implementations NeurIPS 2020 Bobby He, Balaji Lakshminarayanan, Yee Whye Teh

We explore the link between deep ensembles and Gaussian processes (GPs) through the lens of the Neural Tangent Kernel (NTK): a recent development in understanding the training dynamics of wide neural networks (NNs).

Gaussian Processes

Neural Ensemble Search for Uncertainty Estimation and Dataset Shift

1 code implementation NeurIPS 2021 Sheheryar Zaidi, Arber Zela, Thomas Elsken, Chris Holmes, Frank Hutter, Yee Whye Teh

On a variety of classification tasks and modern architecture search spaces, we show that the resulting ensembles outperform deep ensembles not only in terms of accuracy but also uncertainty calibration and robustness to dataset shift.

Image Classification Neural Architecture Search

Multiplicative Interactions and Where to Find Them

no code implementations ICLR 2020 Siddhant M. Jayakumar, Wojciech M. Czarnecki, Jacob Menick, Jonathan Schwarz, Jack Rae, Simon Osindero, Yee Whye Teh, Tim Harley, Razvan Pascanu

We explore the role of multiplicative interaction as a unifying framework to describe a range of classical and modern neural network architectural motifs, such as gating, attention layers, hypernetworks, and dynamic convolutions amongst others.

Inductive Bias

Non-exchangeable feature allocation models with sublinear growth of the feature sizes

no code implementations 30 Mar 2020 Giuseppe Di Benedetto, François Caron, Yee Whye Teh

In particular, the Indian buffet process is a flexible and simple one-parameter feature allocation model where the number of features grows unboundedly with the number of objects.

Uncertainty Estimation Using a Single Deep Deterministic Neural Network

2 code implementations 4 Mar 2020 Joost van Amersfoort, Lewis Smith, Yee Whye Teh, Yarin Gal

We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass.

Out-of-Distribution Detection Uncertainty Quantification

Robust Pruning at Initialization

no code implementations ICLR 2021 Soufiane Hayou, Jean-Francois Ton, Arnaud Doucet, Yee Whye Teh

Overparameterized neural networks (NNs) display state-of-the-art performance.

MetaFun: Meta-Learning with Iterative Functional Updates

1 code implementation ICML 2020 Jin Xu, Jean-Francois Ton, Hyunjik Kim, Adam R. Kosiorek, Yee Whye Teh

We develop a functional encoder-decoder approach to supervised meta-learning, where labeled data is encoded into an infinite-dimensional functional representation rather than a finite-dimensional one.

Few-Shot Image Classification Meta-Learning

Continual Unsupervised Representation Learning

1 code implementation NeurIPS 2019 Dushyant Rao, Francesco Visin, Andrei A. Rusu, Yee Whye Teh, Razvan Pascanu, Raia Hadsell

Continual learning aims to improve the ability of modern learning systems to deal with non-stationary distributions, typically by attempting to learn a series of tasks sequentially.

Continual Learning Representation Learning

Efficient Bayesian Inference for Nested Simulators

no code implementations AABI Symposium 2019 Bradley Gram-Hansen, Christian Schroeder de Witt, Robert Zinkov, Saeid Naderiparizi, Adam Scibior, Andreas Munk, Frank Wood, Mehrdad Ghadiri, Philip Torr, Yee Whye Teh, Atilim Gunes Baydin, Tom Rainforth

We introduce two approaches for conducting efficient Bayesian inference in stochastic simulators containing nested stochastic sub-procedures, i.e., internal procedures for which the density cannot be calculated directly such as rejection sampling loops.

Bayesian Inference

Deep Amortized Clustering

no code implementations ICLR 2020 Juho Lee, Yoonho Lee, Yee Whye Teh

We propose deep amortized clustering (DAC), a neural architecture which learns to cluster datasets efficiently using a few forward passes.

Clustering

Stacked Capsule Autoencoders

12 code implementations NeurIPS 2019 Adam R. Kosiorek, Sara Sabour, Yee Whye Teh, Geoffrey E. Hinton

In the second stage, SCAE predicts parameters of a few object capsules, which are then used to reconstruct part poses.

Cross-Modal Retrieval Object +1

Random Tessellation Forests

no code implementations NeurIPS 2019 Shufei Ge, Shijia Wang, Yee Whye Teh, Liangliang Wang, Lloyd T. Elliott

The Ostomachion process and the self-consistent binary space partitioning-tree process were recently introduced as generalizations of the Mondrian process for space partitioning with non-axis aligned cuts in the two dimensional plane.

Task Agnostic Continual Learning via Meta Learning

no code implementations ICML Workshop LifelongML 2020 Xu He, Jakub Sygnowski, Alexandre Galashov, Andrei A. Rusu, Yee Whye Teh, Razvan Pascanu

One particular formalism that studies learning under non-stationary distribution is provided by continual learning, where the non-stationarity is imposed by a sequence of distinct tasks.

Continual Learning Meta-Learning

Detecting Out-of-Distribution Inputs to Deep Generative Models Using Typicality

2 code implementations 7 Jun 2019 Eric Nalisnick, Akihiro Matsukawa, Yee Whye Teh, Balaji Lakshminarayanan

To determine whether or not inputs reside in the typical set, we propose a statistically principled, easy-to-implement test using the empirical distribution of model likelihoods.
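A minimal sketch of such a typicality-style check, under assumptions: compare a batch's average log-likelihood to the empirical distribution of training log-likelihoods and flag it if it falls too far from the typical value. The stand-in density, calibration rule, and thresholds below are illustrative, not the paper's exact test.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_likelihood(x):                            # stand-in for a trained generative model
    return -0.5 * (x ** 2 + np.log(2 * np.pi))

train = rng.normal(size=10000)
train_ll = log_likelihood(train)
typical_mean = train_ll.mean()

def is_ood(batch, eps):
    # flag the batch if its mean log-likelihood is not epsilon-close to the
    # training mean, i.e. it lies outside the empirical typical set
    return abs(log_likelihood(batch).mean() - typical_mean) > eps

eps = np.quantile(np.abs(train_ll - typical_mean), 0.99)    # crude calibration
print(is_ood(rng.normal(size=64), eps))                      # in-distribution -> likely False
print(is_ood(rng.normal(loc=4.0, size=64), eps))             # shifted data -> likely True
```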

Noise Contrastive Meta-Learning for Conditional Density Estimation using Kernel Mean Embeddings

no code implementations 5 Jun 2019 Jean-Francois Ton, Lucian Chan, Yee Whye Teh, Dino Sejdinovic

Current meta-learning approaches focus on learning functional representations of relationships between variables, i.e. on estimating conditional expectations in regression.

Density Estimation Meta-Learning +1

Hijacking Malaria Simulators with Probabilistic Programming

no code implementations 29 May 2019 Bradley Gram-Hansen, Christian Schröder de Witt, Tom Rainforth, Philip H. S. Torr, Yee Whye Teh, Atılım Güneş Baydin

Epidemiology simulations have become a fundamental tool in the fight against the epidemics of various infectious diseases like AIDS and malaria.

Epidemiology Probabilistic Programming

Revisiting Reweighted Wake-Sleep

no code implementations ICLR 2019 Tuan Anh Le, Adam R. Kosiorek, N. Siddharth, Yee Whye Teh, Frank Wood

Discrete latent-variable models, while applicable in a variety of settings, can often be difficult to learn.

Infinitely Deep Infinite-Width Networks

no code implementations ICLR 2019 Jovana Mitrovic, Peter Wirnsberger, Charles Blundell, Dino Sejdinovic, Yee Whye Teh

Infinite-width neural networks have been extensively used to study the theoretical properties underlying the extraordinary empirical success of standard, finite-width neural networks.

Augmented Neural ODEs

6 code implementations NeurIPS 2019 Emilien Dupont, Arnaud Doucet, Yee Whye Teh

We show that Neural Ordinary Differential Equations (ODEs) learn representations that preserve the topology of the input space and prove that this implies the existence of functions Neural ODEs cannot represent.

Image Classification

Meta-Learning surrogate models for sequential decision making

no code implementations 28 Mar 2019 Alexandre Galashov, Jonathan Schwarz, Hyunjik Kim, Marta Garnelo, David Saxton, Pushmeet Kohli, S. M. Ali Eslami, Yee Whye Teh

We introduce a unified probabilistic framework for solving sequential decision making problems ranging from Bayesian optimisation to contextual bandits and reinforcement learning.

Bayesian Optimisation Decision Making +4

Exploiting Hierarchy for Learning and Transfer in KL-regularized RL

no code implementations 18 Mar 2019 Dhruva Tirumala, Hyeonwoo Noh, Alexandre Galashov, Leonard Hasenclever, Arun Ahuja, Greg Wayne, Razvan Pascanu, Yee Whye Teh, Nicolas Heess

As reinforcement learning agents are tasked with solving more challenging and diverse tasks, the ability to incorporate prior knowledge into the learning system and to exploit reusable structure in solution space is likely to become increasingly important.

Continuous Control reinforcement-learning +1

Hybrid Models with Deep and Invertible Features

1 code implementation 7 Feb 2019 Eric Nalisnick, Akihiro Matsukawa, Yee Whye Teh, Dilan Gorur, Balaji Lakshminarayanan

We propose a neural hybrid model consisting of a linear model defined on a set of features computed by a deep, invertible transformation (i.e. a normalizing flow).
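A minimal sketch of the construction, under strong simplifying assumptions: an invertible feature map gives exact log p(x) by change of variables, a linear (here logistic) model on the same features gives p(y | x), and their sum is a joint log-likelihood. The elementwise affine "flow", parameters, and data below are purely illustrative stand-ins for a real normalizing flow.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = np.array([1.5, 0.7]), np.array([0.2, -0.1])      # invertible affine map z = a * x + b
w, c = np.array([2.0, -1.0]), 0.0                        # linear classifier on features z

def log_p_x(x):
    z = a * x + b
    log_det = np.sum(np.log(np.abs(a)))                  # Jacobian of the elementwise map
    log_base = -0.5 * np.sum(z ** 2 + np.log(2 * np.pi)) # standard normal base density
    return log_base + log_det

def log_p_y_given_x(x, y):
    z = a * x + b
    logit = z @ w + c
    return y * logit - np.log1p(np.exp(logit))           # Bernoulli log-likelihood

x, y = rng.normal(size=2), 1
print("joint log-likelihood:", log_p_x(x) + log_p_y_given_x(x, y))
```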

Probabilistic Deep Learning

Functional Regularisation for Continual Learning with Gaussian Processes

1 code implementation ICLR 2020 Michalis K. Titsias, Jonathan Schwarz, Alexander G. de G. Matthews, Razvan Pascanu, Yee Whye Teh

We introduce a framework for Continual Learning (CL) based on Bayesian inference over the function space rather than the parameters of a deep neural network.

Bayesian Inference Continual Learning +2

Probabilistic symmetries and invariant neural networks

no code implementations 18 Jan 2019 Benjamin Bloem-Reddy, Yee Whye Teh

Treating neural network inputs and outputs as random variables, we characterize the structure of neural networks that can be used to model data that are invariant or equivariant under the action of a compact group.

Attentive Neural Processes

7 code implementations ICLR 2019 Hyunjik Kim, Andriy Mnih, Jonathan Schwarz, Marta Garnelo, Ali Eslami, Dan Rosenbaum, Oriol Vinyals, Yee Whye Teh

Neural Processes (NPs) (Garnelo et al., 2018a;b) approach regression by learning to map a context set of observed input-output pairs to a distribution over regression functions.

regression

Continuous Hierarchical Representations with Poincaré Variational Auto-Encoders

4 code implementations NeurIPS 2019 Emile Mathieu, Charline Le Lan, Chris J. Maddison, Ryota Tomioka, Yee Whye Teh

We therefore endow VAEs with a Poincaré ball model of hyperbolic geometry as a latent space and rigorously derive the necessary methods to work with two main Gaussian generalisations on that space.

Disentangling Disentanglement in Variational Autoencoders

1 code implementation 6 Dec 2018 Emile Mathieu, Tom Rainforth, N. Siddharth, Yee Whye Teh

We develop a generalisation of disentanglement in VAEs, the decomposition of the latent representation, characterising it as the fulfilment of two factors: a) the latent encodings of the data having an appropriate level of overlap, and b) the aggregate encoding of the data conforming to a desired structure, represented through the prior.

Clustering Disentanglement

Stochastic Expectation Maximization with Variance Reduction

no code implementations NeurIPS 2018 Jianfei Chen, Jun Zhu, Yee Whye Teh, Tong Zhang

However, sEM has a slower asymptotic convergence rate than batch EM, and requires a decreasing sequence of step sizes, which is difficult to tune.

Neural probabilistic motor primitives for humanoid control

no code implementations ICLR 2019 Josh Merel, Leonard Hasenclever, Alexandre Galashov, Arun Ahuja, Vu Pham, Greg Wayne, Yee Whye Teh, Nicolas Heess

We focus on the problem of learning a single motor module that can flexibly express a range of behaviors for the control of high-dimensional physically simulated humanoids.

Humanoid Control

A Statistical Approach to Assessing Neural Network Robustness

1 code implementation ICLR 2019 Stefan Webb, Tom Rainforth, Yee Whye Teh, M. Pawan Kumar

Furthermore, it provides an ability to scale to larger networks than formal verification approaches.

On Exploration, Exploitation and Learning in Adaptive Importance Sampling

no code implementations 31 Oct 2018 Xiaoyu Lu, Tom Rainforth, Yuan Zhou, Jan-Willem van de Meent, Yee Whye Teh

We study adaptive importance sampling (AIS) as an online learning problem and argue for the importance of the trade-off between exploration and exploitation in this adaptation.

Do Deep Generative Models Know What They Don't Know?

4 code implementations ICLR 2019 Eric Nalisnick, Akihiro Matsukawa, Yee Whye Teh, Dilan Gorur, Balaji Lakshminarayanan

A neural network deployed in the wild may be asked to make predictions for inputs that were drawn from a different distribution than that of the training data.

Set Transformer: A Framework for Attention-based Permutation-Invariant Neural Networks

9 code implementations 1 Oct 2018 Juho Lee, Yoonho Lee, Jungtaek Kim, Adam R. Kosiorek, Seungjin Choi, Yee Whye Teh

Many machine learning tasks such as multiple instance learning, 3D shape recognition, and few-shot image classification are defined on sets of instances.

3D Shape Recognition Few-Shot Image Classification +1

Hamiltonian Descent Methods

4 code implementations 13 Sep 2018 Chris J. Maddison, Daniel Paulin, Yee Whye Teh, Brendan O'Donoghue, Arnaud Doucet

Yet, crucially, the kinetic gradient map can be designed to incorporate information about the convex conjugate in a fashion that allows for linear convergence on convex functions that may be non-smooth or non-strongly convex.

Sampling and Inference for Beta Neutral-to-the-Left Models of Sparse Networks

1 code implementation 9 Jul 2018 Benjamin Bloem-Reddy, Adam Foster, Emile Mathieu, Yee Whye Teh

Empirical evidence suggests that heavy-tailed degree distributions occurring in many real networks are well-approximated by power laws with exponents $\eta$ that may take values either less than or greater than two.

Neural Processes

13 code implementations 4 Jul 2018 Marta Garnelo, Jonathan Schwarz, Dan Rosenbaum, Fabio Viola, Danilo J. Rezende, S. M. Ali Eslami, Yee Whye Teh

A neural network (NN) is a parameterised function that can be tuned via gradient descent to approximate a labelled collection of data with high precision.

Mix & Match - Agent Curricula for Reinforcement Learning

no code implementations ICML 2018 Wojciech Czarnecki, Siddhant Jayakumar, Max Jaderberg, Leonard Hasenclever, Yee Whye Teh, Nicolas Heess, Simon Osindero, Razvan Pascanu

We introduce Mix & Match (M&M), a training framework designed to facilitate rapid and effective learning in RL agents that would be too slow or too challenging to train otherwise. The key innovation is a procedure that allows us to automatically form a curriculum over agents.

reinforcement-learning Reinforcement Learning (RL)

Inference Trees: Adaptive Inference with Exploration

no code implementations 25 Jun 2018 Tom Rainforth, Yuan Zhou, Xiaoyu Lu, Yee Whye Teh, Frank Wood, Hongseok Yang, Jan-Willem van de Meent

We introduce inference trees (ITs), a new class of inference methods that build on ideas from Monte Carlo tree search to perform adaptive sampling in a manner that balances exploration with exploitation, ensures consistency, and alleviates pathologies in existing adaptive methods.

Controllable Semantic Image Inpainting

no code implementations 15 Jun 2018 Jin Xu, Yee Whye Teh

We develop a method for user-controllable semantic image inpainting: Given an arbitrary set of observed pixels, the unobserved pixels can be imputed in a user-controllable range of possibilities, each of which is semantically coherent and locally consistent with the observed pixels.

Image Inpainting

Sequential Attend, Infer, Repeat: Generative Modelling of Moving Objects

1 code implementation NeurIPS 2018 Adam R. Kosiorek, Hyunjik Kim, Ingmar Posner, Yee Whye Teh

It can reliably discover and track objects throughout the sequence of frames, and can also generate future frames conditioning on the current frame, thereby simulating expected motion of objects.

Revisiting Reweighted Wake-Sleep for Models with Stochastic Control Flow

1 code implementation ICLR 2019 Tuan Anh Le, Adam R. Kosiorek, N. Siddharth, Yee Whye Teh, Frank Wood

Stochastic control-flow models (SCFMs) are a class of generative models that involve branching on choices from discrete random variables.

Progress & Compress: A scalable framework for continual learning

no code implementations ICML 2018 Jonathan Schwarz, Jelena Luketina, Wojciech M. Czarnecki, Agnieszka Grabska-Barwinska, Yee Whye Teh, Razvan Pascanu, Raia Hadsell

This is achieved by training a network with two components: A knowledge base, capable of solving previously encountered problems, which is connected to an active column that is employed to efficiently learn the current task.

Active Learning Atari Games +1

An Analysis of Categorical Distributional Reinforcement Learning

no code implementations 22 Feb 2018 Mark Rowland, Marc G. Bellemare, Will Dabney, Rémi Munos, Yee Whye Teh

Distributional approaches to value-based reinforcement learning model the entire distribution of returns, rather than just their expected values, and have recently been shown to yield state-of-the-art empirical performance.

Distributional Reinforcement Learning reinforcement-learning +1

Tighter Variational Bounds are Not Necessarily Better

3 code implementations ICML 2018 Tom Rainforth, Adam R. Kosiorek, Tuan Anh Le, Chris J. Maddison, Maximilian Igl, Frank Wood, Yee Whye Teh

We provide theoretical and empirical evidence that using tighter evidence lower bounds (ELBOs) can be detrimental to the process of learning an inference network by reducing the signal-to-noise ratio of the gradient estimator.
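For reference, the importance-weighted bound in question is the standard K-sample ELBO (notation assumed here; $\mathcal{L}_1$ is the usual single-sample ELBO), which is non-decreasing in $K$ even though, as the paper argues, the gradient estimator's signal-to-noise ratio can degrade as $K$ grows:

```latex
\mathcal{L}_K(x)
  = \mathbb{E}_{z_1,\dots,z_K \sim q(z \mid x)}
    \left[ \log \frac{1}{K} \sum_{k=1}^{K} \frac{p(x, z_k)}{q(z_k \mid x)} \right],
\qquad
\mathcal{L}_1(x) \le \mathcal{L}_K(x) \le \log p(x).
```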

Faithful Inversion of Generative Models for Effective Amortized Inference

no code implementations NeurIPS 2018 Stefan Webb, Adam Golinski, Robert Zinkov, N. Siddharth, Tom Rainforth, Yee Whye Teh, Frank Wood

Inference amortization methods share information across multiple posterior-inference problems, allowing each to be carried out more efficiently.

Non-exchangeable random partition models for microclustering

no code implementations 20 Nov 2017 Giuseppe Di Benedetto, François Caron, Yee Whye Teh

Along with this result, we provide the asymptotic behaviour of the number of clusters of a given size, and show that the model can exhibit a power-law behavior, controlled by another parameter.

Clustering

Filtering Variational Objectives

3 code implementations NeurIPS 2017 Chris J. Maddison, Dieterich Lawson, George Tucker, Nicolas Heess, Mohammad Norouzi, Andriy Mnih, Arnaud Doucet, Yee Whye Teh

When used as a surrogate objective for maximum likelihood estimation in latent variable models, the evidence lower bound (ELBO) produces state-of-the-art results.

Poisson Random Fields for Dynamic Feature Models

no code implementations 22 Nov 2016 Valerio Perrone, Paul A. Jenkins, Dario Spano, Yee Whye Teh

We present the Wright-Fisher Indian buffet process (WF-IBP), a probabilistic model for time-dependent data assumed to have been generated by an unknown number of latent features.

Posterior Consistency for a Non-parametric Survival Model under a Gaussian Process Prior

no code implementations 7 Nov 2016 Tamara Fernández, Yee Whye Teh

In this paper, we prove almost surely consistency of a Survival Analysis model, which puts a Gaussian process, mapped to the unit interval, as a prior on the so-called hazard function.

Statistics Theory

A nonparametric HMM for genetic imputation and coalescent inference

no code implementations 2 Nov 2016 Lloyd T. Elliott, Yee Whye Teh

We develop a new nonparametric model of genetic sequence data, based on the hierarchical Dirichlet process, which supports these self transitions and nonhomogeneity.

Imputation

The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables

5 code implementations 2 Nov 2016 Chris J. Maddison, Andriy Mnih, Yee Whye Teh

The essence of the trick is to refactor each stochastic node into a differentiable function of its parameters and a random variable with fixed distribution.
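A minimal sketch of this trick for a categorical node, under assumptions: a Concrete (Gumbel-Softmax) sample is a differentiable function of the logits and fixed-distribution Gumbel noise, so gradients flow through the sample. The temperature, logits, and downstream objective below are illustrative.

```python
import torch

def concrete_sample(logits, temperature=0.5):
    gumbel = -torch.log(-torch.log(torch.rand_like(logits)))   # noise with a fixed distribution
    return torch.softmax((logits + gumbel) / temperature, dim=-1)

logits = torch.tensor([1.0, 0.0, -1.0], requires_grad=True)
sample = concrete_sample(logits)                 # relaxed one-hot sample on the simplex
loss = (sample * torch.arange(3.0)).sum()        # any downstream objective
loss.backward()                                  # gradients reach the logits through the sample
print(sample.detach(), logits.grad)
```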

Density Estimation Structured Prediction

Poisson intensity estimation with reproducing kernels

no code implementations 27 Oct 2016 Seth Flaxman, Yee Whye Teh, Dino Sejdinovic

However, we prove that the representer theorem does hold in an appropriately transformed RKHS, guaranteeing that the optimization of the penalized likelihood can be cast as a tractable finite-dimensional problem.

Relativistic Monte Carlo

no code implementations 14 Sep 2016 Xiaoyu Lu, Valerio Perrone, Leonard Hasenclever, Yee Whye Teh, Sebastian J. Vollmer

Based on this, we develop relativistic stochastic gradient descent by taking the zero-temperature limit of relativistic stochastic gradient Hamiltonian Monte Carlo.

A characterization of product-form exchangeable feature probability functions

no code implementations 7 Jul 2016 Marco Battiston, Stefano Favaro, Daniel M. Roy, Yee Whye Teh

We characterize the class of exchangeable feature allocations assigning probability $V_{n, k}\prod_{l=1}^{k}W_{m_{l}}U_{n-m_{l}}$ to a feature allocation of $n$ individuals, displaying $k$ features with counts $(m_{1},\ldots, m_{k})$ for these features.

Bayesian Nonparametrics for Sparse Dynamic Networks

no code implementations 6 Jul 2016 Cian Naik, Francois Caron, Judith Rousseau, Yee Whye Teh, Konstantina Palla

In this paper we propose a Bayesian nonparametric approach to modelling sparse time-varying networks.

The Mondrian Kernel

no code implementations 16 Jun 2016 Matej Balog, Balaji Lakshminarayanan, Zoubin Ghahramani, Daniel M. Roy, Yee Whye Teh

We introduce the Mondrian kernel, a fast random feature approximation to the Laplace kernel.

Collaborative Filtering with Side Information: a Gaussian Process Perspective

no code implementations 23 May 2016 Hyunjik Kim, Xiaoyu Lu, Seth Flaxman, Yee Whye Teh

We tackle the problem of collaborative filtering (CF) with side information, through the lens of Gaussian Process (GP) regression.

Collaborative Filtering regression

DR-ABC: Approximate Bayesian Computation with Kernel-Based Distribution Regression

no code implementations 15 Feb 2016 Jovana Mitrovic, Dino Sejdinovic, Yee Whye Teh

Approximate Bayesian computation (ABC) is an inference framework that constructs an approximation to the true likelihood based on the similarity between the observed and simulated data as measured by a predefined set of summary statistics.

regression

Distributed Bayesian Learning with Stochastic Natural-gradient Expectation Propagation and the Posterior Server

no code implementations 31 Dec 2015 Leonard Hasenclever, Stefan Webb, Thibaut Lienart, Sebastian Vollmer, Balaji Lakshminarayanan, Charles Blundell, Yee Whye Teh

The posterior server allows scalable and robust Bayesian learning in cases where a data set is stored in a distributed manner across a cluster, with each compute node containing a disjoint subset of data.

Variational Inference

The Mondrian Process for Machine Learning

1 code implementation 18 Jul 2015 Matej Balog, Yee Whye Teh

We outline a slight adaptation of this algorithm to regression, as the remainder of the report uses regression as a case study of how Mondrian processes can be utilized in machine learning.

BIG-bench Machine Learning regression

Expectation Particle Belief Propagation

1 code implementation NeurIPS 2015 Thibaut Lienart, Yee Whye Teh, Arnaud Doucet

The computational complexity of our algorithm at each iteration is quadratic in the number of particles.

Particle Gibbs for Bayesian Additive Regression Trees

no code implementations 16 Feb 2015 Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh

Additive regression trees are flexible non-parametric models and popular off-the-shelf tools for real-world non-linear regression.

regression

Distributed Bayesian Posterior Sampling via Moment Sharing

no code implementations NeurIPS 2014 Minjie Xu, Balaji Lakshminarayanan, Yee Whye Teh, Jun Zhu, Bo Zhang

We propose a distributed Markov chain Monte Carlo (MCMC) inference algorithm for large scale Bayesian posterior simulation.

regression

Consistency and fluctuations for stochastic gradient Langevin dynamics

no code implementations 1 Sep 2014 Yee Whye Teh, Alexandre Thiéry, Sebastian Vollmer

Applying standard Markov chain Monte Carlo (MCMC) algorithms to large data sets is computationally expensive.

Bayesian Nonparametric Crowdsourcing

no code implementations 18 Jul 2014 Pablo G. Moreno, Yee Whye Teh, Fernando Perez-Cruz, Antonio Artés-Rodríguez

Crowdsourcing has been proven to be an effective and efficient tool to annotate large datasets.

Active Learning

A marginal sampler for $\sigma$-Stable Poisson-Kingman mixture models

no code implementations 16 Jul 2014 María Lomelí, Stefano Favaro, Yee Whye Teh

We investigate the class of $\sigma$-stable Poisson-Kingman random probability measures (RPMs) in the context of Bayesian nonparametric mixture modeling.

Clustering Density Estimation

Asynchronous Anytime Sequential Monte Carlo

no code implementations NeurIPS 2014 Brooks Paige, Frank Wood, Arnaud Doucet, Yee Whye Teh

We introduce a new sequential Monte Carlo algorithm we call the particle cascade.

Mondrian Forests: Efficient Online Random Forests

2 code implementations NeurIPS 2014 Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh

Ensembles of randomized decision trees, usually referred to as random forests, are widely used for classification and regression tasks in machine learning and statistics.

Adaptive Reconfiguration Moves for Dirichlet Mixtures

no code implementations 31 May 2014 Tue Herlau, Morten Mørup, Yee Whye Teh, Mikkel N. Schmidt

Bayesian mixture models are widely applied for unsupervised learning and exploratory data analysis.

Bayesian Hierarchical Community Discovery

no code implementations NeurIPS 2013 Charles Blundell, Yee Whye Teh

We propose an efficient Bayesian nonparametric model for discovering hierarchical community structure in social networks.

Model Selection

Learning with Invariance via Linear Functionals on Reproducing Kernel Hilbert Space

no code implementations NeurIPS 2013 Xinhua Zhang, Wee Sun Lee, Yee Whye Teh

For the representer theorem to hold, the linear functionals are required to be bounded in the RKHS, and we show that this is true for a variety of commonly used RKHS and invariances.

Stochastic Gradient Riemannian Langevin Dynamics on the Probability Simplex

no code implementations NeurIPS 2013 Sam Patterson, Yee Whye Teh

In this paper we investigate the use of Langevin Monte Carlo methods on the probability simplex and propose a new method, Stochastic gradient Riemannian Langevin dynamics, which is simple to implement and can be applied online.

Inferring ground truth from multi-annotator ordinal data: a probabilistic approach

no code implementations 30 Apr 2013 Balaji Lakshminarayanan, Yee Whye Teh

A popular approach for large scale data annotation tasks is crowdsourcing, wherein each data point is labeled by multiple noisy annotators.

Bayesian Inference

Top-down particle filtering for Bayesian decision trees

no code implementations 3 Mar 2013 Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh

Unlike classic decision tree learning algorithms like ID3, C4.5 and CART, which work in a top-down manner, existing Bayesian algorithms produce an approximation to the posterior distribution by evolving a complete tree (or collection thereof) iteratively via local Monte Carlo modifications to the structure of the tree, e.g., using Markov chain Monte Carlo (MCMC).

On Smoothing and Inference for Topic Models

1 code implementation 9 May 2012 Arthur Asuncion, Max Welling, Padhraic Smyth, Yee Whye Teh

Latent Dirichlet analysis, or topic modeling, is a flexible latent variable framework for modeling high-dimensional sparse count data.

Topic Models Variational Inference

Bayesian Learning via Stochastic Gradient Langevin Dynamics

1 code implementation ICML 2011 Max Welling, Yee Whye Teh

In this paper we propose a new framework for learning from large scale datasets based on iterative learning from small mini-batches.
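A minimal sketch of the resulting update (stochastic gradient Langevin dynamics) on a toy 1-D Gaussian-mean model, under assumed step sizes and batch size: each step is a mini-batch gradient step on the log-posterior plus Gaussian noise whose variance matches the step size.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, size=1000)            # toy data; the posterior mean of theta is ~2
N, batch = len(data), 32
theta, samples = 0.0, []
for t in range(1, 5001):
    eps = 1e-3 * t ** -0.55                      # decreasing step-size schedule
    x = rng.choice(data, size=batch, replace=False)
    grad_log_prior = -theta                      # N(0, 1) prior on theta
    grad_log_lik = (N / batch) * np.sum(x - theta)   # rescaled mini-batch likelihood gradient
    theta += 0.5 * eps * (grad_log_prior + grad_log_lik) + rng.normal(scale=np.sqrt(eps))
    samples.append(theta)
print("posterior mean estimate:", round(float(np.mean(samples[1000:])), 2))
```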

regression
