1 code implementation • 9 Feb 2024 • Tara Akhound-Sadegh, Jarrid Rector-Brooks, Avishek Joey Bose, Sarthak Mittal, Pablo Lemos, Cheng-Hao Liu, Marcin Sendera, Siamak Ravanbakhsh, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Alexander Tong
Efficiently generating statistically independent samples from an unnormalized probability distribution, such as equilibrium samples of many-body systems, is a foundational problem in science.
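As background (standard statistical mechanics, not specific to this entry), the equilibrium distribution of such a system is a Boltzmann density known only up to normalization:

$$
p(x) = \frac{e^{-\mathcal{E}(x)}}{Z}, \qquad Z = \int e^{-\mathcal{E}(x)}\,dx,
$$

where the partition function $Z$ is intractable for all but the simplest energy functions $\mathcal{E}$, which is what makes the sampling problem hard.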
1 code implementation • 7 Feb 2024 • Marcin Sendera, Minsu Kim, Sarthak Mittal, Pablo Lemos, Luca Scimeca, Jarrid Rector-Brooks, Alexandre Adam, Yoshua Bengio, Nikolay Malkin
We study the problem of training diffusion models to sample from a distribution with a given unnormalized density or energy function.
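A standard observation in this setting (again, general background rather than this paper's method) is that the score of the target is available even though $Z$ is not:

$$
\nabla_x \log p(x) = \nabla_x \log \frac{e^{-\mathcal{E}(x)}}{Z} = -\nabla_x \mathcal{E}(x),
$$

so an energy function alone can supply a training signal for a diffusion-based sampler.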
no code implementations • 7 May 2023 • Sarthak Mittal, Oleksii Hrinchuk, Oleksii Kuchaiev
In this work, we provide a recipe for training machine translation models in a limited-resource setting by leveraging synthetic target data generated using a large pre-trained model.
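A minimal sketch of such a pipeline, assuming a Hugging Face seq2seq model as the teacher (the model name and data below are illustrative placeholders, not the paper's setup):

```python
# Sketch: translate source-side monolingual data with a pre-trained teacher,
# then use the outputs as synthetic targets for training a smaller model.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "Helsinki-NLP/opus-mt-en-de"  # illustrative teacher model
tokenizer = AutoTokenizer.from_pretrained(model_name)
teacher = AutoModelForSeq2SeqLM.from_pretrained(model_name)

monolingual_sources = ["The committee approved the proposal."]  # source-only data

synthetic_pairs = []
for src in monolingual_sources:
    batch = tokenizer(src, return_tensors="pt")
    out = teacher.generate(**batch, max_new_tokens=128)
    tgt = tokenizer.decode(out[0], skip_special_tokens=True)
    synthetic_pairs.append((src, tgt))
# synthetic_pairs can now supplement scarce parallel data
# when training a smaller translation model.
```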
1 code implementation • 27 Dec 2022 • Yingtian Zou, Vikas Verma, Sarthak Mittal, Wai Hoh Tang, Hieu Pham, Juho Kannala, Yoshua Bengio, Arno Solin, Kenji Kawaguchi
Mixup is a popular data augmentation technique for training deep neural networks, in which additional samples are generated by linearly interpolating pairs of inputs and their labels.
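As background, a minimal PyTorch sketch of the standard mixup operation the entry refers to (pairing examples via a within-batch permutation is one common implementation choice):

```python
import torch

def mixup(x, y, alpha=0.2):
    """x: (batch, ...) inputs; y: (batch, num_classes) one-hot labels."""
    lam = torch.distributions.Beta(alpha, alpha).sample()  # mixing coefficient
    perm = torch.randperm(x.size(0))            # random pairing within the batch
    x_mix = lam * x + (1 - lam) * x[perm]       # interpolate inputs
    y_mix = lam * y + (1 - lam) * y[perm]       # interpolate labels identically
    return x_mix, y_mix
```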
1 code implementation • 25 Oct 2022 • Sarthak Mittal, Guillaume Lajoie, Stefan Bauer, Arash Mehrjou
Consequently, it is reasonable to ask whether there is an intermediate time step at which the preserved information is optimal for a given downstream task.
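Assuming the standard DDPM forward (noising) process, the sample at step $t$ is

$$
x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon, \qquad \epsilon \sim \mathcal{N}(0, I),
$$

so information about $x_0$ is destroyed gradually as $t$ grows, and an intermediate $t$ can balance noise against preserved task-relevant structure.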
1 code implementation • 9 Jun 2022 • Giancarlo Kerg, Sarthak Mittal, David Rolnick, Yoshua Bengio, Blake Richards, Guillaume Lajoie
Recent work has explored how forcing relational representations to remain distinct from sensory representations, as appears to be the case in the brain, can help artificial systems.
1 code implementation • 6 Jun 2022 • Sarthak Mittal, Yoshua Bengio, Guillaume Lajoie
Inspired by human cognition, machine learning systems are gradually revealing the advantages of sparser, more modular architectures.
3 code implementations • ICLR 2022 • Sarthak Mittal, Sharath Chandra Raparthy, Irina Rish, Yoshua Bengio, Guillaume Lajoie
Through our qualitative analysis, we demonstrate that Compositional Attention leads to dynamic specialization based on the type of retrieval needed.
1 code implementation • 2 Jul 2021 • Nan Rosemary Ke, Aniket Didolkar, Sarthak Mittal, Anirudh Goyal, Guillaume Lajoie, Stefan Bauer, Danilo Rezende, Yoshua Bengio, Michael Mozer, Christopher Pal
A central goal for AI and causality is thus the joint discovery of abstract representations and causal structure.
no code implementations • 29 May 2021 • Korbinian Abstreiter, Sarthak Mittal, Stefan Bauer, Bernhard Schölkopf, Arash Mehrjou
In contrast, the diffusion-based representation learning introduced here relies on a new formulation of the denoising score matching objective and thus encodes the information needed for denoising.
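For reference, the classical denoising score matching objective (Vincent, 2011) at a single noise level $\sigma$ is

$$
\mathcal{L}(\theta) = \mathbb{E}_{x_0 \sim p_{\text{data}},\;\epsilon \sim \mathcal{N}(0, I)}\left[\left\| s_\theta(x_0 + \sigma\epsilon) + \frac{\epsilon}{\sigma} \right\|^2\right],
$$

whose minimizer is the score of the noise-perturbed data distribution; the entry's formulation builds on this objective.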
1 code implementation • ICML 2020 • Sarthak Mittal, Alex Lamb, Anirudh Goyal, Vikram Voleti, Murray Shanahan, Guillaume Lajoie, Michael Mozer, Yoshua Bengio
To effectively utilize the wealth of potential top-down information available, and to prevent the cacophony of intermixed signals in a bidirectional architecture, mechanisms are needed to restrict information flow.
no code implementations • 19 Oct 2018 • Brady Neal, Sarthak Mittal, Aristide Baratin, Vinayak Tantia, Matthew Scicluna, Simon Lacoste-Julien, Ioannis Mitliagkas
The bias-variance tradeoff tells us that as model complexity increases, bias falls and variance increases, leading to a U-shaped test error curve.
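The classical statement follows from the decomposition of expected squared error for a predictor $\hat{f}$ trained on a random dataset:

$$
\mathbb{E}\big[(y - \hat{f}(x))^2\big] = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2} + \underbrace{\operatorname{Var}\big[\hat{f}(x)\big]}_{\text{variance}} + \sigma^2,
$$

where $\sigma^2$ is the irreducible noise; the U-shaped curve arises when falling bias is eventually dominated by rising variance.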