no code implementations • 25 Feb 2024 • Rayen Dhahri, Alexander Immer, Bertrand Charpentier, Stephan Günnemann, Vincent Fortuin
Neural network sparsification is a promising avenue to save computational time and memory costs, especially in an age where many successful AI models are becoming too large to naïvely deploy on consumer hardware.
no code implementations • 1 Feb 2024 • Theodore Papamarkou, Maria Skoularidou, Konstantina Palla, Laurence Aitchison, Julyan Arbel, David Dunson, Maurizio Filippone, Vincent Fortuin, Philipp Hennig, Jose Miguel Hernandez Lobato, Aliaksandr Hubin, Alexander Immer, Theofanis Karaletsos, Mohammad Emtiyaz Khan, Agustinus Kristiadi, Yingzhen Li, Stephan Mandt, Christopher Nemeth, Michael A. Osborne, Tim G. J. Rudner, David Rügamer, Yee Whye Teh, Max Welling, Andrew Gordon Wilson, Ruqi Zhang
In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets.
no code implementations • 30 Nov 2023 • Alexander Möllers, Alexander Immer, Elvin Isufi, Vincent Fortuin
Graph contrastive learning has shown great promise when labeled data is scarce but large unlabeled datasets are available.
no code implementations • NeurIPS 2023 • Runa Eschenhagen, Alexander Immer, Richard E. Turner, Frank Schneider, Philipp Hennig
In this work, we identify two different settings of linear weight-sharing layers which motivate two flavours of K-FAC -- $\textit{expand}$ and $\textit{reduce}$.
1 code implementation • 3 Oct 2023 • Alexandru Meterez, Amir Joudaki, Francesco Orabona, Alexander Immer, Gunnar Rätsch, Hadi Daneshmand
We answer this question in the affirmative by giving a particular construction of a Multi-Layer Perceptron (MLP) with linear activations and batch normalization that provably has bounded gradients at any depth.
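A minimal sketch of the architecture class in question, assuming standard PyTorch layers (the class name, width, and depth below are illustrative, not the paper's exact construction): a deep MLP with linear activations interleaved with batch normalization, whose gradients remain finite even at large depth.

```python
# Minimal sketch (not the paper's exact construction): a deep MLP with
# linear (identity) activations interleaved with batch normalization,
# the architecture class for which bounded gradients are claimed.
import torch
import torch.nn as nn

class LinearBNMLP(nn.Module):  # name, width, and depth are illustrative
    def __init__(self, dim: int = 64, depth: int = 50):
        super().__init__()
        layers = []
        for _ in range(depth):
            layers += [nn.Linear(dim, dim, bias=False), nn.BatchNorm1d(dim)]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

model = LinearBNMLP()
x = torch.randn(128, 64, requires_grad=True)  # a batch of 128 inputs
loss = model(x).pow(2).mean()
loss.backward()
print(x.grad.norm())  # gradient stays finite despite the large depth
```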
no code implementations • 14 Sep 2023 • Alexander Möllers, Alexander Immer, Vincent Fortuin, Elvin Isufi
We leverage this decomposition to develop a contrastive self-supervised learning approach for processing simplicial data and generating embeddings that encapsulate specific spectral information. Specifically, we encode the pertinent data invariances through simplicial neural networks and devise augmentations that yield positive contrastive examples with suitable spectral properties for downstream tasks.
1 code implementation • 6 Jun 2023 • Alexander Immer, Tycho F. A. van der Ouderaa, Mark van der Wilk, Gunnar Rätsch, Bernhard Schölkopf
Recent works show that Bayesian model selection with Laplace approximations allows such hyperparameters to be optimized just like standard neural network parameters, using gradients on the training data alone.
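For context, a standard form of the Laplace approximation to the marginal likelihood that such methods differentiate is, with $\theta_*$ the MAP estimate, $P$ the number of parameters, $H_{\theta_*}(\eta)$ the Hessian (or GGN) of the negative log joint at $\theta_*$, and $\eta$ the hyperparameters being tuned:

$$\log p(\mathcal{D} \mid \eta) \approx \log p(\mathcal{D} \mid \theta_*, \eta) + \log p(\theta_* \mid \eta) + \tfrac{P}{2}\log 2\pi - \tfrac{1}{2}\log\det H_{\theta_*}(\eta),$$

and gradient ascent on this quantity with respect to $\eta$ tunes hyperparameters such as prior precisions or augmentation parameters using only the training data.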
no code implementations • 26 May 2023 • Kouroche Bouchiat, Alexander Immer, Hugo Yèche, Gunnar Rätsch, Vincent Fortuin
Neural additive models (NAMs) enhance the transparency of deep neural networks by handling input features in separate additive sub-networks.
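To illustrate the additive structure (a generic sketch, not this paper's exact model): each input feature is handled by its own small sub-network, and the prediction is the sum of per-feature contributions, which is what makes each feature's effect inspectable.

```python
# Minimal sketch of a neural additive model: one small MLP per input
# feature, with the prediction formed as the sum of per-feature
# contributions plus a bias. Architecture sizes are illustrative.
import torch
import torch.nn as nn

class TinyNAM(nn.Module):
    def __init__(self, n_features: int, hidden: int = 32):
        super().__init__()
        self.feature_nets = nn.ModuleList([
            nn.Sequential(nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1))
            for _ in range(n_features)
        ])
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):  # x: (batch, n_features)
        contribs = [net(x[:, i:i + 1]) for i, net in enumerate(self.feature_nets)]
        contribs = torch.cat(contribs, dim=1)            # (batch, n_features)
        return contribs.sum(dim=1, keepdim=True) + self.bias, contribs

model = TinyNAM(n_features=4)
y_hat, per_feature = model(torch.randn(8, 4))
# per_feature[:, i] is feature i's additive contribution, which is what
# makes the model's predictions transparent to inspect.
```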
1 code implementation • 17 Apr 2023 • Agustinus Kristiadi, Alexander Immer, Runa Eschenhagen, Vincent Fortuin
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
1 code implementation • 13 Oct 2022 • Alexander Immer, Christoph Schultheiss, Julia E. Vogt, Bernhard Schölkopf, Peter Bühlmann, Alexander Marx
We study the class of location-scale or heteroscedastic noise models (LSNMs), in which the effect $Y$ can be written as a function of the cause $X$ and a noise source $N$ independent of $X$, which may be scaled by a positive function $g$ over the cause, i.e., $Y = f(X) + g(X)N$.
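A toy simulation of such a model makes the heteroscedastic structure concrete; the particular choices of $f$ and $g$ below are illustrative (any positive $g$ would do).

```python
# Toy simulation of a location-scale noise model Y = f(X) + g(X) * N,
# with illustrative choices f(x) = tanh(2x) and g(x) = 0.5 + x**2
# (g must be positive) and N drawn independently of X.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.uniform(-2, 2, size=n)       # cause X
noise = rng.standard_normal(size=n)  # noise N, independent of X

f = np.tanh(2 * x)                   # location (mean) function
g = 0.5 + x**2                       # positive scale function
y = f + g * noise                    # effect Y with heteroscedastic noise

# The conditional variance of Y given X = x is g(x)**2, so the noise
# level changes with the cause -- the defining property of an LSNM.
```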
1 code implementation • 22 Feb 2022 • Alexander Immer, Tycho F. A. van der Ouderaa, Gunnar Rätsch, Vincent Fortuin, Mark van der Wilk
We develop a convenient gradient-based method for selecting data augmentations during the training of a deep neural network, without requiring validation data.
1 code implementation • ACL 2022 • Alexander Immer, Lucas Torroba Hennigen, Vincent Fortuin, Ryan Cotterell
Such performance improvements have motivated researchers to quantify and understand the linguistic information encoded in these representations.
no code implementations • NeurIPS Workshop ICBINB 2021 • Tristan Cinquin, Alexander Immer, Max Horn, Vincent Fortuin
In recent years, the transformer has established itself as a workhorse in many applications ranging from natural language processing to reinforcement learning.
3 code implementations • NeurIPS 2021 • Erik Daxberger, Agustinus Kristiadi, Alexander Immer, Runa Eschenhagen, Matthias Bauer, Philipp Hennig
Bayesian formulations of deep learning have been shown to have compelling theoretical properties and offer practical functional benefits, such as improved predictive uncertainty quantification and model selection.
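The code accompanying this paper includes the laplace package for PyTorch; below is a minimal, hedged usage sketch (argument names may differ between package versions, and the tiny model and synthetic data are purely illustrative).

```python
# Hedged sketch of post-hoc Laplace inference using the `laplace`
# (laplace-torch) package; exact argument names may differ across
# package versions. Model and data below are illustrative only.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from laplace import Laplace

# Toy binary classification problem and a small MAP-trained network.
X = torch.randn(256, 2)
y = (X[:, 0] > 0).long()
train_loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)
model = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(50):
    for xb, yb in train_loader:
        opt.zero_grad()
        nn.functional.cross_entropy(model(xb), yb).backward()
        opt.step()

# Fit a last-layer, Kronecker-factored Laplace approximation post hoc
# and tune the prior precision via the Laplace marginal likelihood.
la = Laplace(model, 'classification',
             subset_of_weights='last_layer', hessian_structure='kron')
la.fit(train_loader)
la.optimize_prior_precision(method='marglik')

# Approximate posterior-predictive class probabilities with uncertainty.
probs = la(X[:5], link_approx='probit')
```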
1 code implementation • 11 Apr 2021 • Alexander Immer, Matthias Bauer, Vincent Fortuin, Gunnar Rätsch, Mohammad Emtiyaz Khan
Marginal-likelihood-based model selection, although promising, is rarely used in deep learning due to estimation difficulties.
1 code implementation • 19 Aug 2020 • Alexander Immer, Maciej Korzepa, Matthias Bauer
The generalized Gauss-Newton (GGN) approximation is often used to make practical Bayesian deep learning approaches scalable by replacing a second-order derivative with a product of first-order derivatives.
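A hedged sketch of the idea for a toy model with squared-error loss, where the Hessian of the loss with respect to the outputs is the identity and the GGN reduces to $J^\top J$, with $J$ the Jacobian of the model outputs with respect to the parameters (model and data below are illustrative).

```python
# Minimal sketch of the generalized Gauss-Newton idea: the Hessian of the
# loss w.r.t. the parameters is replaced by J^T H_out J, where J is the
# Jacobian of the model outputs w.r.t. the parameters and H_out is the
# Hessian of the loss w.r.t. the outputs -- only first-order derivatives
# of the network are required.
import torch

def model(w, x):
    # Tiny nonlinear model: a one-hidden-unit network with parameters
    # flattened into the vector w = (w0, w1, w2).
    return torch.tanh(w[0] * x + w[1]) * w[2]

w = torch.randn(3)
x = torch.randn(20)
y = torch.randn(20)

# Jacobian of the 20 outputs w.r.t. the 3 parameters, shape (20, 3).
J = torch.autograd.functional.jacobian(lambda w_: model(w_, x), w)

# For the squared-error loss 0.5 * ||model(w, x) - y||^2, the Hessian of
# the loss w.r.t. the outputs is the identity, so the GGN is simply J^T J,
# a (3, 3) positive semi-definite curvature matrix.
ggn = J.T @ J
```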
no code implementations • 21 Jul 2020 • Alexander Immer
Algorithms that combine the Gauss-Newton method with Laplace and Gaussian variational approximations have recently led to state-of-the-art results in Bayesian deep learning.
1 code implementation • NeurIPS 2020 • Pingbo Pan, Siddharth Swaroop, Alexander Immer, Runa Eschenhagen, Richard E. Turner, Mohammad Emtiyaz Khan
Continually learning new skills is important for intelligent systems, yet standard deep learning methods suffer from catastrophic forgetting of the past.
1 code implementation • 17 Jun 2019 • Alexander Immer, Guillaume P. Dehaene
The Black Box Variational Inference algorithm (Ranganath et al., 2014) provides a universal method for variational inference, but taking advantage of special properties of the approximation family or of the target can significantly improve convergence speed.
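A minimal sketch of the black-box, score-function flavour of the method (a generic illustration, not the paper's proposed estimator; the target, step size, and sample count below are illustrative): only log-density evaluations of the target are needed, which is what makes the approach "black box".

```python
# Minimal sketch of black-box variational inference with the score-function
# (REINFORCE) gradient estimator for a 1-D Gaussian approximation.
import numpy as np

rng = np.random.default_rng(0)

def log_p(z):
    # Unnormalized target: a Gaussian with mean 3 and standard deviation 2.
    return -0.5 * ((z - 3.0) / 2.0) ** 2

mu, log_sigma = 0.0, 0.0          # variational parameters of q = N(mu, sigma^2)
lr, n_samples = 0.05, 64

for step in range(2000):
    sigma = np.exp(log_sigma)
    z = mu + sigma * rng.standard_normal(n_samples)      # samples from q
    log_q = -0.5 * ((z - mu) / sigma) ** 2 - log_sigma   # log q(z) up to a constant
    # Score function of q with respect to (mu, log_sigma).
    score_mu = (z - mu) / sigma**2
    score_ls = ((z - mu) ** 2) / sigma**2 - 1.0
    weight = log_p(z) - log_q                             # ELBO integrand
    mu += lr * np.mean(score_mu * weight)                 # gradient ascent on the ELBO
    log_sigma += lr * np.mean(score_ls * weight)

# Estimates are noisy but should move toward the target's mean 3 and std 2.
print(mu, np.exp(log_sigma))
```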
1 code implementation • NeurIPS 2019 • Mohammad Emtiyaz Khan, Alexander Immer, Ehsan Abedi, Maciej Korzepa
Deep neural networks (DNNs) and Gaussian processes (GPs) are two powerful models with several theoretical connections relating them, but the relationship between their training methods is not well understood.
no code implementations • 28 Nov 2017 • Danijar Hafner, Alexander Immer, Willi Raschkowski, Fabian Windheuser
Then, we capture a user's interest as a generative model in the space of the document representations.