no code implementations • 25 Feb 2024 • Rayen Dhahri, Alexander Immer, Bertrand Charpentier, Stephan Günnemann, Vincent Fortuin
Neural network sparsification is a promising avenue to save computational time and memory costs, especially in an age where many successful AI models are becoming too large to naïvely deploy on consumer hardware.
no code implementations • 1 Feb 2024 • Theodore Papamarkou, Maria Skoularidou, Konstantina Palla, Laurence Aitchison, Julyan Arbel, David Dunson, Maurizio Filippone, Vincent Fortuin, Philipp Hennig, Jose Miguel Hernandez Lobato, Aliaksandr Hubin, Alexander Immer, Theofanis Karaletsos, Mohammad Emtiyaz Khan, Agustinus Kristiadi, Yingzhen Li, Stephan Mandt, Christopher Nemeth, Michael A. Osborne, Tim G. J. Rudner, David Rügamer, Yee Whye Teh, Max Welling, Andrew Gordon Wilson, Ruqi Zhang
In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets.
no code implementations • 30 Nov 2023 • Alexander Möllers, Alexander Immer, Elvin Isufi, Vincent Fortuin
Graph contrastive learning has shown great promise when labeled data is scarce but large unlabeled datasets are available.
no code implementations • NeurIPS 2023 • Runa Eschenhagen, Alexander Immer, Richard E. Turner, Frank Schneider, Philipp Hennig
In this work, we identify two different settings of linear weight-sharing layers which motivate two flavours of K-FAC -- $\textit{expand}$ and $\textit{reduce}$.
1 code implementation • 3 Oct 2023 • Alexandru Meterez, Amir Joudaki, Francesco Orabona, Alexander Immer, Gunnar Rätsch, Hadi Daneshmand
We answer this question in the affirmative by giving a particular construction of a Multi-Layer Perceptron (MLP) with linear activations and batch normalization that provably has bounded gradients at any depth.
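A minimal sketch of the architecture class in question, assuming standard PyTorch layers (the class name, width, and depth below are illustrative, not the paper's exact construction): a deep MLP with linear activations interleaved with batch normalization, whose gradients remain finite even at large depth.

```python
# Minimal sketch (not the paper's exact construction): a deep MLP with
# linear (identity) activations interleaved with batch normalization,
# the architecture class for which bounded gradients are claimed.
import torch
import torch.nn as nn

class LinearBNMLP(nn.Module):  # name, width, and depth are illustrative
    def __init__(self, dim: int = 64, depth: int = 50):
        super().__init__()
        layers = []
        for _ in range(depth):
            layers += [nn.Linear(dim, dim, bias=False), nn.BatchNorm1d(dim)]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

model = LinearBNMLP()
x = torch.randn(128, 64, requires_grad=True)  # a batch of 128 inputs
loss = model(x).pow(2).mean()
loss.backward()
print(x.grad.norm())  # gradient stays finite despite the large depth
```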
no code implementations • 14 Sep 2023 • Alexander Möllers, Alexander Immer, Vincent Fortuin, Elvin Isufi
We leverage this decomposition to develop a contrastive self-supervised learning approach for processing simplicial data and generating embeddings that encapsulate specific spectral information. Specifically, we encode the pertinent data invariances through simplicial neural networks and devise augmentations that yield positive contrastive examples with suitable spectral properties for downstream tasks.
1 code implementation • 6 Jun 2023 • Alexander Immer, Tycho F. A. van der Ouderaa, Mark van der Wilk, Gunnar Rätsch, Bernhard Schölkopf
Recent works show that Bayesian model selection with Laplace approximations allows such hyperparameters to be optimized just like standard neural network parameters, using gradients on the training data alone.
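For context, a standard form of the Laplace approximation to the marginal likelihood that such methods differentiate is, with $\theta_*$ the MAP estimate, $P$ the number of parameters, $H_{\theta_*}(\eta)$ the Hessian (or GGN) of the negative log joint at $\theta_*$, and $\eta$ the hyperparameters being tuned:

$$\log p(\mathcal{D} \mid \eta) \approx \log p(\mathcal{D} \mid \theta_*, \eta) + \log p(\theta_* \mid \eta) + \tfrac{P}{2}\log 2\pi - \tfrac{1}{2}\log\det H_{\theta_*}(\eta),$$

and gradient ascent on this quantity with respect to $\eta$ tunes hyperparameters such as prior precisions or augmentation parameters using only the training data.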
no code implementations • 26 May 2023 • Kouroche Bouchiat, Alexander Immer, Hugo Yèche, Gunnar Rätsch, Vincent Fortuin
Neural additive models (NAMs) enhance the transparency of deep neural networks by handling input features in separate additive sub-networks.
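To illustrate the additive structure (a generic sketch, not this paper's exact model): each input feature is handled by its own small sub-network, and the prediction is the sum of per-feature contributions, which is what makes each feature's effect inspectable.

```python
# Minimal sketch of a neural additive model: one small MLP per input
# feature, with the prediction formed as the sum of per-feature
# contributions plus a bias. Architecture sizes are illustrative.
import torch
import torch.nn as nn

class TinyNAM(nn.Module):
    def __init__(self, n_features: int, hidden: int = 32):
        super().__init__()
        self.feature_nets = nn.ModuleList([
            nn.Sequential(nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1))
            for _ in range(n_features)
        ])
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):  # x: (batch, n_features)
        contribs = [net(x[:, i:i + 1]) for i, net in enumerate(self.feature_nets)]
        contribs = torch.cat(contribs, dim=1)            # (batch, n_features)
        return contribs.sum(dim=1, keepdim=True) + self.bias, contribs

model = TinyNAM(n_features=4)
y_hat, per_feature = model(torch.randn(8, 4))
# per_feature[:, i] is feature i's additive contribution, which is what
# makes the model's predictions transparent to inspect.
```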
1 code implementation • 17 Apr 2023 • Agustinus Kristiadi, Alexander Immer, Runa Eschenhagen, Vincent Fortuin
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
1 code implementation • 13 Oct 2022 • Alexander Immer, Christoph Schultheiss, Julia E. Vogt, Bernhard Schölkopf, Peter Bühlmann, Alexander Marx
We study the class of location-scale or heteroscedastic noise models (LSNMs), in which the effect $Y$ can be written as a function of the cause $X$ and a noise source $N$ independent of $X$, which may be scaled by a positive function $g$ over the cause, i.e., $Y = f(X) + g(X)N$.
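A toy simulation of such a model makes the heteroscedastic structure concrete; the particular choices of $f$ and $g$ below are illustrative (any positive $g$ would do).

```python
# Toy simulation of a location-scale noise model Y = f(X) + g(X) * N,
# with illustrative choices f(x) = tanh(2x) and g(x) = 0.5 + x**2
# (g must be positive) and N drawn independently of X.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.uniform(-2, 2, size=n)       # cause X
noise = rng.standard_normal(size=n)  # noise N, independent of X

f = np.tanh(2 * x)                   # location (mean) function
g = 0.5 + x**2                       # positive scale function
y = f + g * noise                    # effect Y with heteroscedastic noise

# The conditional variance of Y given X = x is g(x)**2, so the noise
# level changes with the cause -- the defining property of an LSNM.
```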
1 code implementation • 22 Feb 2022 • Alexander Immer, Tycho F. A. van der Ouderaa, Gunnar Rätsch, Vincent Fortuin, Mark van der Wilk
We develop a convenient gradient-based method for selecting data augmentations during the training of a deep neural network, without requiring validation data.
1 code implementation • ACL 2022 • Alexander Immer, Lucas Torroba Hennigen, Vincent Fortuin, Ryan Cotterell
Such performance improvements have motivated researchers to quantify and understand the linguistic information encoded in these representations.
no code implementations • NeurIPS Workshop ICBINB 2021 • Tristan Cinquin, Alexander Immer, Max Horn, Vincent Fortuin
In recent years, the transformer has established itself as a workhorse in many applications ranging from natural language processing to reinforcement learning.
3 code implementations • NeurIPS 2021 • Erik Daxberger, Agustinus Kristiadi, Alexander Immer, Runa Eschenhagen, Matthias Bauer, Philipp Hennig
Bayesian formulations of deep learning have been shown to have compelling theoretical properties and offer practical functional benefits, such as improved predictive uncertainty quantification and model selection.
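The code accompanying this paper includes the laplace package for PyTorch; below is a minimal, hedged usage sketch (argument names may differ between package versions, and the tiny model and synthetic data are purely illustrative).

```python
# Hedged sketch of post-hoc Laplace inference using the `laplace`
# (laplace-torch) package; exact argument names may differ across
# package versions. Model and data below are illustrative only.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from laplace import Laplace

# Toy binary classification problem and a small MAP-trained network.
X = torch.randn(256, 2)
y = (X[:, 0] > 0).long()
train_loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)
model = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(50):
    for xb, yb in train_loader:
        opt.zero_grad()
        nn.functional.cross_entropy(model(xb), yb).backward()
        opt.step()

# Fit a last-layer, Kronecker-factored Laplace approximation post hoc
# and tune the prior precision via the Laplace marginal likelihood.
la = Laplace(model, 'classification',
             subset_of_weights='last_layer', hessian_structure='kron')
la.fit(train_loader)
la.optimize_prior_precision(method='marglik')

# Approximate posterior-predictive class probabilities with uncertainty.
probs = la(X[:5], link_approx='probit')
```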
1 code implementation • 11 Apr 2021 • Alexander Immer, Matthias Bauer, Vincent Fortuin, Gunnar Rätsch, Mohammad Emtiyaz Khan
Marginal-likelihood-based model selection, although promising, is rarely used in deep learning due to estimation difficulties.
1 code implementation • 19 Aug 2020 • Alexander Immer, Maciej Korzepa, Matthias Bauer
The generalized Gauss-Newton (GGN) approximation is often used to make practical Bayesian deep learning approaches scalable by replacing a second-order derivative with a product of first-order derivatives.
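A hedged sketch of the idea for a toy model with squared-error loss, where the Hessian of the loss with respect to the outputs is the identity and the GGN reduces to $J^\top J$, with $J$ the Jacobian of the model outputs with respect to the parameters (model and data below are illustrative).

```python
# Minimal sketch of the generalized Gauss-Newton idea: the Hessian of the
# loss w.r.t. the parameters is replaced by J^T H_out J, where J is the
# Jacobian of the model outputs w.r.t. the parameters and H_out is the
# Hessian of the loss w.r.t. the outputs -- only first-order derivatives
# of the network are required.
import torch

def model(w, x):
    # Tiny nonlinear model: a one-hidden-unit network with parameters
    # flattened into the vector w = (w0, w1, w2).
    return torch.tanh(w[0] * x + w[1]) * w[2]

w = torch.randn(3)
x = torch.randn(20)
y = torch.randn(20)

# Jacobian of the 20 outputs w.r.t. the 3 parameters, shape (20, 3).
J = torch.autograd.functional.jacobian(lambda w_: model(w_, x), w)

# For the squared-error loss 0.5 * ||model(w, x) - y||^2, the Hessian of
# the loss w.r.t. the outputs is the identity, so the GGN is simply J^T J,
# a (3, 3) positive semi-definite curvature matrix.
ggn = J.T @ J
```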
no code implementations • 21 Jul 2020 • Alexander Immer
Algorithms that combine the Gauss-Newton method with Laplace and Gaussian variational approximations have recently led to state-of-the-art results in Bayesian deep learning.
1 code implementation • NeurIPS 2020 • Pingbo Pan, Siddharth Swaroop, Alexander Immer, Runa Eschenhagen, Richard E. Turner, Mohammad Emtiyaz Khan
Continually learning new skills is important for intelligent systems, yet standard deep learning methods suffer from catastrophic forgetting of the past.
1 code implementation • 17 Jun 2019 • Alexander Immer, Guillaume P. Dehaene
The Black Box Variational Inference algorithm (Ranganath et al., 2014) provides a universal method for variational inference, but taking advantage of special properties of the approximation family or of the target can significantly improve convergence speed.
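A minimal sketch of the black-box, score-function flavour of the method (a generic illustration, not the paper's proposed estimator; the target, step size, and sample count below are illustrative): only log-density evaluations of the target are needed, which is what makes the approach "black box".

```python
# Minimal sketch of black-box variational inference with the score-function
# (REINFORCE) gradient estimator for a 1-D Gaussian approximation.
import numpy as np

rng = np.random.default_rng(0)

def log_p(z):
    # Unnormalized target: a Gaussian with mean 3 and standard deviation 2.
    return -0.5 * ((z - 3.0) / 2.0) ** 2

mu, log_sigma = 0.0, 0.0          # variational parameters of q = N(mu, sigma^2)
lr, n_samples = 0.05, 64

for step in range(2000):
    sigma = np.exp(log_sigma)
    z = mu + sigma * rng.standard_normal(n_samples)      # samples from q
    log_q = -0.5 * ((z - mu) / sigma) ** 2 - log_sigma   # log q(z) up to a constant
    # Score function of q with respect to (mu, log_sigma).
    score_mu = (z - mu) / sigma**2
    score_ls = ((z - mu) ** 2) / sigma**2 - 1.0
    weight = log_p(z) - log_q                             # ELBO integrand
    mu += lr * np.mean(score_mu * weight)                 # gradient ascent on the ELBO
    log_sigma += lr * np.mean(score_ls * weight)

# Estimates are noisy but should move toward the target's mean 3 and std 2.
print(mu, np.exp(log_sigma))
```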
1 code implementation • NeurIPS 2019 • Mohammad Emtiyaz Khan, Alexander Immer, Ehsan Abedi, Maciej Korzepa
Deep neural networks (DNNs) and Gaussian processes (GPs) are two powerful models with several theoretical connections relating them, but the relationship between their training methods is not well understood.
no code implementations • 28 Nov 2017 • Danijar Hafner, Alexander Immer, Willi Raschkowski, Fabian Windheuser
Then, we capture a user's interest as a generative model in the space of the document representations.