Search Results for author: David Duvenaud

Found 59 papers, 39 papers with code

Experts Don't Cheat: Learning What You Don't Know By Predicting Pairs

no code implementations • 13 Feb 2024 • Daniel D. Johnson, Daniel Tarlow, David Duvenaud, Chris J. Maddison

Identifying how much a model ${\widehat{p}}_{\theta}(Y|X)$ knows about the stochastic real-world process $p(Y|X)$ it was trained on is important to ensure it avoids producing incorrect or "hallucinated" answers or taking unsafe actions.

Image Classification Language Modelling +1

Paper
Add Code

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

1 code implementation • 10 Jan 2024 • Evan Hubinger, Carson Denison, Jesse Mu, Mike Lambert, Meg Tong, Monte MacDiarmid, Tamera Lanham, Daniel M. Ziegler, Tim Maxwell, Newton Cheng, Adam Jermyn, Amanda Askell, Ansh Radhakrishnan, Cem Anil, David Duvenaud, Deep Ganguli, Fazl Barez, Jack Clark, Kamal Ndousse, Kshitij Sachan, Michael Sellitto, Mrinank Sharma, Nova DasSarma, Roger Grosse, Shauna Kravec, Yuntao Bai, Zachary Witten, Marina Favaro, Jan Brauner, Holden Karnofsky, Paul Christiano, Samuel R. Bowman, Logan Graham, Jared Kaplan, Sören Mindermann, Ryan Greenblatt, Buck Shlegeris, Nicholas Schiefer, Ethan Perez

We find that such backdoor behavior can be made persistent, so that it is not removed by standard safety training techniques, including supervised fine-tuning, reinforcement learning, and adversarial training (eliciting unsafe behavior and then training to remove it).

Paper
Code

Sorting Out Quantum Monte Carlo

no code implementations • 9 Nov 2023 • Jack Richter-Powell, Luca Thiede, Alán Asparu-Guzik, David Duvenaud

Molecular modeling at the quantum level requires choosing a parameterization of the wavefunction that both respects the required particle symmetries, and is scalable to systems of many particles.

valid

Paper
Add Code

Towards Understanding Sycophancy in Language Models

1 code implementation • 20 Oct 2023 • Mrinank Sharma, Meg Tong, Tomasz Korbak, David Duvenaud, Amanda Askell, Samuel R. Bowman, Newton Cheng, Esin Durmus, Zac Hatfield-Dodds, Scott R. Johnston, Shauna Kravec, Timothy Maxwell, Sam McCandlish, Kamal Ndousse, Oliver Rausch, Nicholas Schiefer, Da Yan, Miranda Zhang, Ethan Perez

Overall, our results indicate that sycophancy is a general behavior of state-of-the-art AI assistants, likely driven in part by human preference judgments favoring sycophantic responses.

Text Generation

Paper
Code

On Implicit Bias in Overparameterized Bilevel Optimization

no code implementations • 28 Dec 2022 • Paul Vicol, Jonathan Lorraine, Fabian Pedregosa, David Duvenaud, Roger Grosse

Bilevel problems consist of two nested sub-problems, called the outer and inner problems, respectively.

Bilevel Optimization Hyperparameter Optimization +1

Paper
Add Code

Meta-Learning to Improve Pre-Training

no code implementations • NeurIPS 2021 • Aniruddh Raghu, Jonathan Lorraine, Simon Kornblith, Matthew McDermott, David Duvenaud

Pre-training (PT) followed by fine-tuning (FT) is an effective method for training neural networks, and has led to significant performance improvements in many domains.

Data Augmentation Hyperparameter Optimization +1

Paper
Add Code

Complex Momentum for Optimization in Games

no code implementations • 16 Feb 2021 • Jonathan Lorraine, David Acuna, Paul Vicol, David Duvenaud

We generalize gradient descent with momentum for optimization in differentiable games to have complex-valued momentum.

Paper
Add Code

Infinitely Deep Bayesian Neural Networks with Stochastic Differential Equations

2 code implementations • 12 Feb 2021 • Winnie Xu, Ricky T. Q. Chen, Xuechen Li, David Duvenaud

We perform scalable approximate inference in continuous-depth Bayesian neural networks.

Variational Inference

154

Paper
Code

Oops I Took A Gradient: Scalable Sampling for Discrete Distributions

1 code implementation • 8 Feb 2021 • Will Grathwohl, Kevin Swersky, Milad Hashemi, David Duvenaud, Chris J. Maddison

We propose a general and scalable approximate sampling strategy for probabilistic models with discrete variables.

Paper
Code

Self-Tuning Stochastic Optimization with Curvature-Aware Gradient Filtering

no code implementations • NeurIPS Workshop ICBINB 2020 • Ricky T. Q. Chen, Dami Choi, Lukas Balles, David Duvenaud, Philipp Hennig

Standard first-order stochastic optimization algorithms base their updates solely on the average mini-batch gradient, and it has been shown that tracking additional quantities such as the curvature can help de-sensitize common hyperparameters.

Stochastic Optimization

Paper
Add Code

Teaching with Commentaries

1 code implementation • ICLR 2021 • Aniruddh Raghu, Maithra Raghu, Simon Kornblith, David Duvenaud, Geoffrey Hinton

We find that commentaries can improve training speed and/or performance, and provide insights about the dataset and training process.

Data Augmentation

Paper
Code

No MCMC for me: Amortized sampling for fast and stable training of energy-based models

1 code implementation • ICLR 2021 • Will Grathwohl, Jacob Kelly, Milad Hashemi, Mohammad Norouzi, Kevin Swersky, David Duvenaud

Energy-Based Models (EBMs) present a flexible and appealing way to represent uncertainty.

Paper
Code

Learning Differential Equations that are Easy to Solve

1 code implementation • NeurIPS 2020 • Jacob Kelly, Jesse Bettencourt, Matthew James Johnson, David Duvenaud

Differential equations parameterized by neural networks become expensive to solve numerically as training progresses.

Density Estimation Time Series +1

262

Paper
Code

A Study of Gradient Variance in Deep Learning

1 code implementation • 9 Jul 2020 • Fartash Faghri, David Duvenaud, David J. Fleet, Jimmy Ba

We introduce a method, Gradient Clustering, to minimize the variance of average mini-batch gradient with stratified sampling.

Clustering

Paper
Code

SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models

no code implementations • ICLR 2020 • Yucen Luo, Alex Beatson, Mohammad Norouzi, Jun Zhu, David Duvenaud, Ryan P. Adams, Ricky T. Q. Chen

Standard variational lower bounds used to train latent variable models produce biased estimates of most quantities of interest.

Paper
Add Code

What went wrong and when? Instance-wise Feature Importance for Time-series Models

no code implementations • 5 Mar 2020 • Sana Tonekaboni, Shalmali Joshi, Kieran Campbell, David Duvenaud, Anna Goldenberg

Explanations of time series models are useful for high stakes applications like healthcare but have received little attention in machine learning literature.

counterfactual Feature Importance +2

Paper
Add Code

Learning the Stein Discrepancy for Training and Evaluating Energy-Based Models without Sampling

1 code implementation • ICML 2020 • Will Grathwohl, Kuan-Chieh Wang, Jorn-Henrik Jacobsen, David Duvenaud, Richard Zemel

We estimate the Stein discrepancy between the data density $p(x)$ and the model density $q(x)$ defined by a vector function of the data.

Paper
Code

Scalable Gradients for Stochastic Differential Equations

4 code implementations • 5 Jan 2020 • Xuechen Li, Ting-Kam Leonard Wong, Ricky T. Q. Chen, David Duvenaud

The adjoint sensitivity method scalably computes gradients of solutions to ordinary differential equations.

Ranked #1 on Video Prediction on CMU Mocap-2

Variational Inference Video Prediction

1,461

Paper
Code

Neural Networks with Cheap Differential Operators

no code implementations • 8 Dec 2019 • Ricky T. Q. Chen, David Duvenaud

Gradients of neural networks can be computed efficiently for any architecture, but some applications require differential operators with higher time complexity.

Paper
Add Code

Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One

4 code implementations • ICLR 2020 • Will Grathwohl, Kuan-Chieh Wang, Jörn-Henrik Jacobsen, David Duvenaud, Mohammad Norouzi, Kevin Swersky

In this setting, the standard class probabilities can be easily computed as well as unnormalized values of p(x) and p(x|y).

411

Paper
Code

Optimizing Millions of Hyperparameters by Implicit Differentiation

9 code implementations • 6 Nov 2019 • Jonathan Lorraine, Paul Vicol, David Duvenaud

We propose an algorithm for inexpensive gradient-based hyperparameter optimization that combines the implicit function theorem (IFT) with efficient inverse Hessian approximations.

Data Augmentation Hyperparameter Optimization

1,153

Paper
Code

Efficient Graph Generation with Graph Recurrent Attention Networks

2 code implementations • NeurIPS 2019 • Renjie Liao, Yujia Li, Yang Song, Shenlong Wang, Charlie Nash, William L. Hamilton, David Duvenaud, Raquel Urtasun, Richard S. Zemel

Our model generates graphs one block of nodes and associated edges at a time.

Graph Generation

450

Paper
Code

Explaining Time Series by Counterfactuals

no code implementations • 25 Sep 2019 • Sana Tonekaboni, Shalmali Joshi, David Duvenaud, Anna Goldenberg

We propose a method to automatically compute the importance of features at every observation in time series, by simulating counterfactual trajectories given previous observations.

counterfactual Feature Importance +2

Paper
Add Code

Understanding Undesirable Word Embedding Associations

no code implementations • ACL 2019 • Kawin Ethayarajh, David Duvenaud, Graeme Hirst

Word embeddings are often criticized for capturing undesirable word associations such as gender stereotypes.

Word Embeddings

Paper
Add Code

Latent ODEs for Irregularly-Sampled Time Series

12 code implementations • 8 Jul 2019 • Yulia Rubanova, Ricky T. Q. Chen, David Duvenaud

Time series with non-uniform intervals occur in many applications, and are difficult to model using standard recurrent neural networks (RNNs).

Ranked #1 on Multivariate Time Series Imputation on MuJoCo

Multivariate Time Series Forecasting Multivariate Time Series Imputation +3

482

Paper
Code

Residual Flows for Invertible Generative Modeling

4 code implementations • NeurIPS 2019 • Ricky T. Q. Chen, Jens Behrmann, David Duvenaud, Jörn-Henrik Jacobsen

Flow-based generative models parameterize probability distributions through an invertible transformation and can be trained by maximum likelihood.

Ranked #2 on Image Generation on MNIST

Density Estimation Image Generation

258

Paper
Code

Self-Tuning Networks: Bilevel Optimization of Hyperparameters using Structured Best-Response Functions

3 code implementations • ICLR 2019 • Matthew MacKay, Paul Vicol, Jon Lorraine, David Duvenaud, Roger Grosse

Empirically, our approach outperforms competing hyperparameter optimization methods on large-scale deep learning problems.

Bilevel Optimization Data Augmentation +1

Paper
Code

Invertible Residual Networks

5 code implementations • 2 Nov 2018 • Jens Behrmann, Will Grathwohl, Ricky T. Q. Chen, David Duvenaud, Jörn-Henrik Jacobsen

We show that standard ResNet architectures can be made invertible, allowing the same model to be used for classification, density estimation, and generation.

Ranked #5 on Image Generation on MNIST

Density Estimation General Classification +1

511

Paper
Code

Scalable Recommender Systemsthrough Recursive Evidence Chains

no code implementations • 20 Oct 2018 • Elias Tragas, Calvin Luo, Maxime Yvez, Kevin Luk, David Duvenaud

A popular matrix completion algorithm is matrix factorization, where ratings are predicted from combining learned user and item parameter vectors.

Matrix Completion Recommendation Systems

Paper
Add Code

Towards Understanding Linear Word Analogies

no code implementations • ACL 2019 • Kawin Ethayarajh, David Duvenaud, Graeme Hirst

A surprising property of word vectors is that word analogies can often be solved with vector arithmetic.

Paper
Add Code

FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models

7 code implementations • ICLR 2019 • Will Grathwohl, Ricky T. Q. Chen, Jesse Bettencourt, Ilya Sutskever, David Duvenaud

The result is a continuous-time invertible generative model with unbiased density estimation and one-pass sampling, while allowing unrestricted neural network architectures.

Ranked #1 on Density Estimation on UCI MINIBOONE

Density Estimation Image Generation +1

614

Paper
Code

Stochastic Combinatorial Ensembles for Defending Against Adversarial Examples

no code implementations • 20 Aug 2018 • George A. Adam, Petr Smirnov, David Duvenaud, Benjamin Haibe-Kains, Anna Goldenberg

Many deep learning algorithms can be easily fooled with simple adversarial examples.

Adversarial Attack Metric Learning

Paper
Add Code

Explaining Image Classifiers by Counterfactual Generation

1 code implementation • ICLR 2019 • Chun-Hao Chang, Elliot Creager, Anna Goldenberg, David Duvenaud

We can rephrase this question to ask: which parts of the image, if they were not seen by the classifier, would most change its decision?

counterfactual Image Classification

Paper
Code

Scalable Recommender Systems through Recursive Evidence Chains

no code implementations • 5 Jul 2018 • Elias Tragas, Calvin Luo, Maxime Gazeau, Kevin Luk, David Duvenaud

Recommender systems can be formulated as a matrix completion problem, predicting ratings from user and item parameter vectors.

Matrix Completion Recommendation Systems

Paper
Add Code

Neural Ordinary Differential Equations

55 code implementations • NeurIPS 2018 • Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, David Duvenaud

Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network.

Ranked #2 on Multivariate Time Series Imputation on MuJoCo

Multivariate Time Series Forecasting Multivariate Time Series Imputation

5,180

Paper
Code

Stochastic Hyperparameter Optimization through Hypernetworks

1 code implementation • ICLR 2018 • Jonathan Lorraine, David Duvenaud

Machine learning models are often tuned by nesting optimization of model weights inside the optimization of hyperparameters.

BIG-bench Machine Learning Hyperparameter Optimization +1

Paper
Code

Isolating Sources of Disentanglement in Variational Autoencoders

10 code implementations • NeurIPS 2018 • Ricky T. Q. Chen, Xuechen Li, Roger Grosse, David Duvenaud

We decompose the evidence lower bound to show the existence of a term measuring the total correlation between latent variables.

Disentanglement

1,675

Paper
Code

Inference Suboptimality in Variational Autoencoders

2 code implementations • ICML 2018 • Chris Cremer, Xuechen Li, David Duvenaud

Furthermore, we show that the parameters used to increase the expressiveness of the approximation play a role in generalizing inference rather than simply improving the complexity of the approximation.

Paper
Code

Generating and designing DNA with deep generative models

2 code implementations • 17 Dec 2017 • Nathan Killoran, Leo J. Lee, Andrew Delong, David Duvenaud, Brendan J. Frey

We propose generative neural network methods to generate DNA sequences and tune them to have desired properties.

Generative Adversarial Network

Paper
Code

Noisy Natural Gradient as Variational Inference

2 code implementations • ICML 2018 • Guodong Zhang, Shengyang Sun, David Duvenaud, Roger Grosse

Variational Bayesian neural nets combine the flexibility of deep learning with Bayesian uncertainty estimation.

Active Learning Efficient Exploration +2

Paper
Code

Backpropagation through the Void: Optimizing control variates for black-box gradient estimation

7 code implementations • ICLR 2018 • Will Grathwohl, Dami Choi, Yuhuai Wu, Geoffrey Roeder, David Duvenaud

Gradient-based optimization is the foundation of deep learning and reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

159

Paper
Code

Reinterpreting Importance-Weighted Autoencoders

no code implementations • 10 Apr 2017 • Chris Cremer, Quaid Morris, David Duvenaud

The standard interpretation of importance-weighted autoencoders is that they maximize a tighter lower bound on the marginal likelihood than the standard evidence lower bound.

Paper
Add Code

Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference

1 code implementation • NeurIPS 2017 • Geoffrey Roeder, Yuhuai Wu, David Duvenaud

We propose a simple and general variant of the standard reparameterized gradient estimator for the variational evidence lower bound.

Variational Inference

Paper
Code

Automatic chemical design using a data-driven continuous representation of molecules

11 code implementations • 7 Oct 2016 • Rafael Gómez-Bombarelli, Jennifer N. Wei, David Duvenaud, José Miguel Hernández-Lobato, Benjamín Sánchez-Lengeling, Dennis Sheberla, Jorge Aguilera-Iparraguirre, Timothy D. Hirzel, Ryan P. Adams, Alán Aspuru-Guzik

We report a method to convert discrete representations of molecules to and from a multidimensional continuous representation.

Efficient Exploration

454

Paper
Code

Neural networks for the prediction organic chemistry reactions

no code implementations • 22 Aug 2016 • Jennifer N. Wei, David Duvenaud, Alán Aspuru-Guzik

Reaction prediction remains one of the major challenges for organic chemistry, and is a pre-requisite for efficient synthetic planning.

Paper
Add Code

Composing graphical models with neural networks for structured representations and fast inference

3 code implementations • NeurIPS 2016 • Matthew J. Johnson, David Duvenaud, Alexander B. Wiltschko, Sandeep R. Datta, Ryan P. Adams

We propose a general modeling and inference framework that composes probabilistic graphical models with deep learning methods and combines their respective strengths.

Variational Inference

344

Paper
Code

Convolutional Networks on Graphs for Learning Molecular Fingerprints

8 code implementations • NeurIPS 2015 • David Duvenaud, Dougal Maclaurin, Jorge Aguilera-Iparraguirre, Rafael Gómez-Bombarelli, Timothy Hirzel, Alán Aspuru-Guzik, Ryan P. Adams

We introduce a convolutional neural network that operates directly on graphs.

Ranked #2 on Drug Discovery on HIV dataset

Drug Discovery Graph Regression +1

486

Paper
Code

Early Stopping is Nonparametric Variational Inference

1 code implementation • 6 Apr 2015 • Dougal Maclaurin, David Duvenaud, Ryan P. Adams

By tracking the change in entropy over this sequence of transformations during optimization, we form a scalable, unbiased estimate of the variational lower bound on the log marginal likelihood.

Variational Inference

Paper
Code

Gradient-based Hyperparameter Optimization through Reversible Learning

2 code implementations • 11 Feb 2015 • Dougal Maclaurin, David Duvenaud, Ryan P. Adams

Tuning hyperparameters of learning algorithms is hard because gradients are usually unavailable.

Hyperparameter Optimization

293

Paper
Code

Raiders of the Lost Architecture: Kernels for Bayesian Optimization in Conditional Parameter Spaces

no code implementations • 14 Sep 2014 • Kevin Swersky, David Duvenaud, Jasper Snoek, Frank Hutter, Michael A. Osborne

In practical Bayesian optimization, we must often search over structures with differing numbers of parameters.

Bayesian Optimization

Paper
Add Code

Warped Mixtures for Nonparametric Cluster Shapes

1 code implementation • 9 Aug 2014 • Tomoharu Iwata, David Duvenaud, Zoubin Ghahramani

A mixture of Gaussians fit to a single curved or heavy-tailed cluster will report that the data contains many clusters.

Density Estimation

Paper
Code

Optimally-Weighted Herding is Bayesian Quadrature

no code implementations • 9 Aug 2014 • Ferenc Huszar, David Duvenaud

We show that the criterion minimised when selecting samples in kernel herding is equivalent to the posterior variance in Bayesian quadrature.

Paper
Add Code

Probabilistic ODE Solvers with Runge-Kutta Means

no code implementations • NeurIPS 2014 • Michael Schober, David Duvenaud, Philipp Hennig

We construct a family of probabilistic numerical methods that instead return a Gauss-Markov process defining a probability distribution over the ODE solution.

Paper
Add Code

Avoiding pathologies in very deep networks

2 code implementations • 24 Feb 2014 • David Duvenaud, Oren Rippel, Ryan P. Adams, Zoubin Ghahramani

Choosing appropriate architectures and regularization strategies for deep networks is crucial to good predictive performance.

Gaussian Processes

Paper
Code

Automatic Construction and Natural-Language Description of Nonparametric Regression Models

2 code implementations • 18 Feb 2014 • James Robert Lloyd, David Duvenaud, Roger Grosse, Joshua B. Tenenbaum, Zoubin Ghahramani

This paper presents the beginnings of an automatic statistician, focusing on regression problems.

Gaussian Processes regression +2

221

Paper
Code

Structure Discovery in Nonparametric Regression through Compositional Kernel Search

4 code implementations • 20 Feb 2013 • David Duvenaud, James Robert Lloyd, Roger Grosse, Joshua B. Tenenbaum, Zoubin Ghahramani

Despite its importance, choosing the structural form of the kernel in nonparametric regression remains a black art.

regression Time Series +1

221

Paper
Code

Warped Mixtures for Nonparametric Cluster Shapes

1 code implementation • 8 Jun 2012 • Tomoharu Iwata, David Duvenaud, Zoubin Ghahramani

A mixture of Gaussians fit to a single curved or heavy-tailed cluster will report that the data contains many clusters.

Density Estimation

Paper
Code

Optimally-Weighted Herding is Bayesian Quadrature

1 code implementation • 7 Apr 2012 • Ferenc Huszár, David Duvenaud

We show that the criterion minimised when selecting samples in kernel herding is equivalent to the posterior variance in Bayesian quadrature.

Paper
Code

Additive Gaussian Processes

1 code implementation • NeurIPS 2011 • David Duvenaud, Hannes Nickisch, Carl Edward Rasmussen

We introduce a Gaussian process model of functions which are additive.

Additive models Gaussian Processes +1

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.