no code implementations • 26 Apr 2024 • Benjamin Dupuis, Paul Viallard, George Deligiannidis, Umut Simsekli
We propose data-dependent uniform generalization bounds by approaching the problem from a PAC-Bayesian perspective.
no code implementations • 20 Feb 2024 • Fabian Schaipp, Guillaume Garrigos, Umut Simsekli, Robert Gower
We then derive iterative methods based on the stochastic proximal point method for computing the geometric median and generalizations thereof.
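The abstract excerpt above does not give the iteration itself, but a minimal sketch of a stochastic proximal point method for the geometric median is easy to state: each term f_i(x) = ||x - a_i|| has a closed-form proximal map (a block soft-threshold), so one can repeatedly sample a data point and apply its prox. The function name and step-size schedule below are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def stochastic_prox_point_median(points, steps=10000, gamma0=1.0, seed=0):
    """Illustrative stochastic proximal point iteration for the geometric
    median (hypothetical sketch, not the paper's exact method).
    Each f_i(x) = ||x - a_i|| has the closed-form prox
    prox_{g*f_i}(x) = a_i + max(0, 1 - g/||x - a_i||) * (x - a_i)."""
    rng = np.random.default_rng(seed)
    x = points.mean(axis=0)                      # start from the arithmetic mean
    for k in range(1, steps + 1):
        a = points[rng.integers(len(points))]    # sample one data point
        gamma = gamma0 / np.sqrt(k)              # decaying step size (assumed)
        r = x - a
        nrm = np.linalg.norm(r)
        if nrm > 0:
            x = a + max(0.0, 1.0 - gamma / nrm) * r   # prox of gamma * ||. - a||
    return x
```

Unlike the mean, the iterate is robust to outliers: a single far-away point moves the result only slightly.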
no code implementations • 13 Feb 2024 • Maxime Haddouche, Paul Viallard, Umut Simsekli, Benjamin Guedj
Modern machine learning usually involves predictors in the overparametrised setting (number of trained parameters greater than dataset size), and their training yields not only good performance on the training data but also good generalisation capacity.
no code implementations • 4 Jul 2023 • Sarah Sachs, Tim van Erven, Liam Hodgkinson, Rajiv Khanna, Umut Simsekli
Algorithm- and data-dependent generalization bounds are required to explain the generalization behavior of modern machine learning algorithms.
1 code implementation • 13 Jun 2023 • Yijun Wan, Melih Barsbey, Abdellatif Zaidi, Umut Simsekli
Neural network compression has been an increasingly important subject, not only due to its practical relevance, but also due to its theoretical implications, as there is an explicit connection between compressibility and generalization error.
no code implementations • 13 May 2022 • Mert Gurbuzbalaban, Yuanhan Hu, Umut Simsekli, Kun Yuan, Lingjiong Zhu
To have a more explicit control on the tail exponent, we then consider the case where the loss at each node is a quadratic, and show that the tail-index can be estimated as a function of the step-size, batch-size, and the topological properties of the network of the computational nodes.
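The paper's own tail-index formula (as a function of step-size, batch-size, and network topology) is not reproduced in this excerpt; as a generic point of reference, the classical Hill estimator recovers a tail index from the largest order statistics of a sample. This is a textbook estimator, not the paper's SGD-specific result.

```python
import numpy as np

def hill_tail_index(samples, k=100):
    """Classic Hill estimator of the tail index alpha from the k largest
    absolute order statistics (generic textbook tool, not the paper's
    step-size/batch-size formula)."""
    x = np.sort(np.abs(samples))[::-1]       # descending absolute values
    logs = np.log(x[:k]) - np.log(x[k])      # log-spacings above the (k+1)-th
    return 1.0 / np.mean(logs)               # Hill estimate of alpha
```

On Pareto-distributed data with index 2, the estimate concentrates near 2 as k grows.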
no code implementations • NeurIPS 2020 • Valentin De Bortoli, Alain Durmus, Xavier Fontaine, Umut Simsekli
In comparison to previous works on the subject, we consider settings in which the sequence of stepsizes in SGD can potentially depend on the number of neurons and the iterations.
no code implementations • 28 Feb 2020 • Soheil Kolouri, Kimia Nadjahi, Umut Simsekli, Shahin Shahrampour
Probability metrics have become an indispensable part of modern statistics and machine learning, and they play a central role in various applications, including statistical hypothesis testing and generative modeling.
no code implementations • 19 Oct 2019 • Alireza Fallah, Mert Gurbuzbalaban, Asuman Ozdaglar, Umut Simsekli, Lingjiong Zhu
When gradients do not contain noise, we also prove that distributed accelerated methods can \emph{achieve acceleration}, requiring $\mathcal{O}(\kappa \log(1/\varepsilon))$ gradient evaluations and $\mathcal{O}(\kappa \log(1/\varepsilon))$ communications to converge to the same fixed point as the non-accelerated variant, where $\kappa$ is the condition number and $\varepsilon$ is the target accuracy.
1 code implementation • 11 Mar 2019 • Ali Taylan Cemgil, Mehmet Burak Kurutmaz, Sinan Yildirim, Melih Barsbey, Umut Simsekli
We introduce a dynamic generative model, the Bayesian allocation model (BAM), which establishes explicit connections between nonnegative tensor factorization (NTF), graphical models of discrete probability distributions and their Bayesian extensions, and topic models such as latent Dirichlet allocation.
no code implementations • 8 Feb 2019 • Simon Leglaive, Umut Simsekli, Antoine Liutkus, Laurent Girin, Radu Horaud
This paper focuses on single-channel semi-supervised speech enhancement.
1 code implementation • NeurIPS 2019 • Soheil Kolouri, Kimia Nadjahi, Umut Simsekli, Roland Badeau, Gustavo K. Rohde
The sliced-Wasserstein (SW) distance, specifically, was shown to have properties similar to those of the Wasserstein distance while being much simpler to compute, and it is therefore used in various applications, including generative modeling and general supervised/unsupervised learning.
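The "much simpler to compute" claim is concrete: in one dimension the Wasserstein distance between equal-size empirical measures reduces to comparing sorted samples, so the SW distance only needs random projections plus sorting. A minimal Monte-Carlo sketch (function name and defaults are illustrative assumptions):

```python
import numpy as np

def sliced_wasserstein(X, Y, n_proj=200, p=2, seed=0):
    """Monte-Carlo estimate of the sliced-Wasserstein distance between two
    empirical distributions with equally many samples (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    total = 0.0
    for _ in range(n_proj):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)          # uniform direction on the sphere
        xp = np.sort(X @ theta)                 # 1D projections, sorted
        yp = np.sort(Y @ theta)
        total += np.mean(np.abs(xp - yp) ** p)  # closed-form 1D W_p^p
    return (total / n_proj) ** (1.0 / p)
```

Each slice costs one matrix-vector product and a sort, versus solving an optimal transport problem for the full Wasserstein distance.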
1 code implementation • 18 Jan 2019 • Umut Simsekli, Levent Sagun, Mert Gurbuzbalaban
This assumption is often made for mathematical convenience, since it enables SGD to be analyzed as a stochastic differential equation (SDE) driven by a Brownian motion.
no code implementations • ICML 2018 • Umut Simsekli, Cagatay Yildiz, Than Huy Nguyen, Taylan Cemgil, Gael Richard
The results support our theory and show that the proposed algorithm provides a significant speedup over the recently proposed synchronous distributed L-BFGS algorithm.
no code implementations • NeurIPS 2016 • Alain Durmus, Umut Simsekli, Eric Moulines, Roland Badeau, Gaël Richard
We illustrate our framework on the popular Stochastic Gradient Langevin Dynamics (SGLD) algorithm and propose a novel SG-MCMC algorithm referred to as Stochastic Gradient Richardson-Romberg Langevin Dynamics (SGRRLD).
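For context, the vanilla SGLD update that SGRRLD builds on is a stochastic-gradient step on the log-posterior plus injected Gaussian noise of variance matching the step size. The sketch below shows plain SGLD on a toy Gaussian target; the Richardson-Romberg extrapolation itself (running two chains at step sizes eta and eta/2 and combining them) is the paper's contribution and is not reproduced here. Function names and the minibatch-noise model are illustrative assumptions.

```python
import numpy as np

def sgld_step(theta, stoch_grad, eta, rng):
    """One vanilla SGLD update: half a gradient step on the log-posterior
    plus N(0, eta) noise (illustrative; SGRRLD extrapolates two such chains)."""
    return theta + 0.5 * eta * stoch_grad + np.sqrt(eta) * rng.normal(size=theta.shape)

# Toy usage: sample from N(0, 1), i.e. log p(theta) = -theta^2 / 2.
rng = np.random.default_rng(0)
theta = np.zeros(1)
samples = []
for _ in range(20000):
    grad = -theta + 0.1 * rng.normal(size=theta.shape)  # noisy gradient (mock minibatch noise)
    theta = sgld_step(theta, grad, eta=0.1, rng=rng)
    samples.append(theta.copy())
samples = np.array(samples)[5000:]   # discard burn-in
```

The retained samples have mean near 0 and variance near 1, up to the O(eta) discretisation bias that Richardson-Romberg extrapolation is designed to reduce.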
no code implementations • NeurIPS 2011 • Kenan Y. Yılmaz, Ali T. Cemgil, Umut Simsekli
We derive algorithms for generalised tensor factorisation (GTF) by building upon the well-established theory of Generalised Linear Models.