no code implementations • 29 Aug 2023 • Hadrien Hendrikx, Paul Mangold, Aurélien Bellet
Leveraging this assumption, we introduce the Relative Gaussian Mechanism (RGM), in which the variance of the noise depends on the norm of the output.
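A minimal sketch of that idea, assuming a NumPy-array output and a hypothetical `sigma_rel` parameter (illustrative only; the paper's exact privacy calibration is not reproduced here):

```python
import numpy as np

def relative_gaussian_mechanism(output, sigma_rel, rng=None):
    """Illustrative RGM-style noising: the Gaussian noise's standard
    deviation is proportional to the norm of the released value.
    sigma_rel is a hypothetical relative-noise parameter."""
    rng = np.random.default_rng() if rng is None else rng
    scale = sigma_rel * np.linalg.norm(output)  # noise std scales with ||output||
    return output + rng.normal(0.0, scale, size=output.shape)
```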
no code implementations • 2 May 2023 • Anastasia Koloskova, Hadrien Hendrikx, Sebastian U. Stich
In particular, we show that (i) for deterministic gradient descent, the clipping threshold only affects the higher-order terms of convergence, and (ii) in the stochastic setting, convergence to the true optimum cannot be guaranteed under the standard noise assumption, even with arbitrarily small step-sizes.
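For reference, gradient clipping rescales any gradient whose norm exceeds the threshold before the descent step; a minimal sketch (function and parameter names are illustrative):

```python
import numpy as np

def clipped_gradient_descent(grad, x0, step_size, clip_threshold, n_steps):
    """Gradient descent with norm clipping: gradients whose norm
    exceeds clip_threshold are rescaled onto the clipping ball."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        g = grad(x)
        norm = np.linalg.norm(g)
        if norm > clip_threshold:
            g = g * (clip_threshold / norm)  # rescale onto the clipping ball
        x = x - step_size * g
    return x
```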
1 code implementation • 5 Jan 2023 • Thijs Vogels, Hadrien Hendrikx, Martin Jaggi
This paper aims to paint an accurate picture of sparsely connected distributed optimization.
1 code implementation • 7 Jun 2022 • Thijs Vogels, Hadrien Hendrikx, Martin Jaggi
In data-parallel optimization of machine learning models, workers collaborate to improve their estimates of the model: more accurate gradients allow them to use larger learning rates and optimize faster.
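A minimal sketch of one synchronous data-parallel step under that description (names are illustrative): averaging the workers' stochastic gradients reduces their variance, which is what permits the larger learning rates.

```python
import numpy as np

def data_parallel_step(x, worker_grads, lr):
    """One synchronous data-parallel update: average the workers'
    stochastic gradients, then take a single descent step."""
    avg_grad = np.mean(worker_grads, axis=0)  # variance shrinks with the number of workers
    return x - lr * avg_grad
```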
1 code implementation • 10 Jun 2021 • Mathieu Even, Raphaël Berthier, Francis Bach, Nicolas Flammarion, Pierre Gaillard, Hadrien Hendrikx, Laurent Massoulié, Adrien Taylor
We introduce the continuized Nesterov acceleration, a close variant of Nesterov acceleration whose variables are indexed by a continuous time parameter.
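A minimal sketch of the continuized iteration, assuming constant parameters (all names here are illustrative; the tuned choices in the paper depend on the smoothness and strong convexity of $f$): two coupled variables mix continuously between the jump times of a rate-one Poisson process and take gradient steps at the jumps.

```python
import numpy as np

def continuized_nesterov(grad, x0, eta, eta_p, gamma, gamma_p, T, rng=None):
    """Continuized-style acceleration sketch: between Poisson jump
    times, x and z follow the mixing ODE dx = eta*(z - x) dt,
    dz = eta_p*(x - z) dt (solved in closed form below); at each
    jump, both take a gradient step evaluated at x."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    z = x.copy()
    t = 0.0
    while t < T:
        tau = rng.exponential(1.0)  # inter-arrival time of the rate-1 Poisson process
        w = (eta_p * x + eta * z) / (eta + eta_p)   # invariant of the mixing ODE
        d = (x - z) * np.exp(-(eta + eta_p) * tau)  # x - z decays exponentially
        x = w + eta / (eta + eta_p) * d
        z = w - eta_p / (eta + eta_p) * d
        g = grad(x)                 # gradient step at the jump time
        x = x - gamma * g
        z = z - gamma_p * g
        t += tau
    return x
```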
no code implementations • 7 Jun 2021 • Mathieu Even, Hadrien Hendrikx, Laurent Massoulié
Our approach yields a precise characterization of convergence time and of its dependency on heterogeneous delays in the network.
no code implementations • 28 Jan 2019 • Hadrien Hendrikx, Francis Bach, Laurent Massoulié
In this work, we study the problem of minimizing the sum of strongly convex functions split over a network of $n$ nodes.
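As a generic baseline for this problem class (not the method introduced in the paper), decentralized gradient descent interleaves gossip averaging with local gradient steps; a minimal sketch, assuming a doubly stochastic mixing matrix `W` supported on the network's edges:

```python
import numpy as np

def decentralized_gradient_descent(local_grads, X, W, step_size, n_steps):
    """Each row of X is one node's iterate; local_grads[i] returns
    node i's local gradient. W @ X performs one round of gossip
    averaging with the neighbors."""
    X = np.array(X, dtype=float)
    for _ in range(n_steps):
        G = np.stack([local_grads[i](X[i]) for i in range(len(local_grads))])
        X = W @ X - step_size * G  # mix with neighbors, then step locally
    return X
```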
Optimization and Control • Distributed, Parallel, and Cluster Computing
no code implementations • 5 Oct 2018 • Hadrien Hendrikx, Francis Bach, Laurent Massoulié
Applying ESDACD to quadratic local functions leads to an accelerated randomized gossip algorithm of rate $O(\sqrt{\theta_{\rm gossip}/n})$, where $\theta_{\rm gossip}$ is the rate of the standard randomized gossip.
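For context, the standard randomized gossip baseline whose rate $\theta_{\rm gossip}$ is referenced above works as follows (a minimal sketch; edge activation is uniform here, though general activation probabilities are possible):

```python
import numpy as np

def randomized_gossip(values, edges, n_steps, rng=None):
    """Standard randomized gossip: at each step a random edge (i, j)
    is activated and its endpoints replace their values by the pair's
    average. All values converge to the global average."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(values, dtype=float)
    for _ in range(n_steps):
        i, j = edges[rng.integers(len(edges))]
        x[i] = x[j] = 0.5 * (x[i] + x[j])
    return x
```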
no code implementations • NeurIPS 2017 • El Mahdi El Mhamdi, Rachid Guerraoui, Hadrien Hendrikx, Alexandre Maurer
We give realistic sufficient conditions on the learning algorithm to enable dynamic safe interruptibility in the case of joint action learners, yet show that these conditions are not sufficient for independent learners.
Multi-agent Reinforcement Learning • Reinforcement Learning