Search Results for author: Ahmed Khaled

Found 14 papers, 4 papers with code

Directional Smoothness and Gradient Methods: Convergence and Adaptivity

no code implementations • 6 Mar 2024 • Aaron Mishkin, Ahmed Khaled, Yuanhao Wang, Aaron Defazio, Robert M. Gower

We develop new sub-optimality bounds for gradient descent (GD) that depend on the conditioning of the objective along the path of optimization, rather than on global, worst-case constants.

Tuning-Free Stochastic Optimization

no code implementations • 12 Feb 2024 • Ahmed Khaled, Chi Jin

For the task of finding a stationary point of a smooth and potentially nonconvex function, we give a variant of SGD that matches the best-known high-probability convergence rate for tuned SGD at only an additional polylogarithmic cost.

Stochastic Optimization

DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method

1 code implementation • NeurIPS 2023 • Ahmed Khaled, Konstantin Mishchenko, Chi Jin

This paper proposes a new easy-to-implement parameter-free gradient-based optimizer: DoWG (Distance over Weighted Gradients).
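
The snippet does not spell out the update rule, so the following is only a minimal sketch of the general idea behind a distance-over-weighted-gradients step size, assuming the commonly described form (a running estimate of the distance from the starting point divided by a distance-weighted sum of squared gradient norms). See the paper and its code release for the actual DoWG algorithm.

```python
import numpy as np

def dowg_sketch(grad, x0, steps=100, r_eps=1e-4):
    """Illustrative parameter-free gradient method in the spirit of DoWG.

    Assumed form (not taken from the listing above): step size is
    r**2 / sqrt(v), where r estimates the distance travelled from x0
    and v is a distance-weighted running sum of squared gradient norms.
    """
    x = x0.copy()
    r = r_eps          # running distance estimate, initialized to a small epsilon
    v = 0.0            # distance-weighted sum of squared gradient norms
    for _ in range(steps):
        g = grad(x)
        r = max(r, np.linalg.norm(x - x0))   # distance estimate never shrinks
        v += r**2 * np.dot(g, g)
        x = x - (r**2 / np.sqrt(v)) * g      # no tuned learning rate anywhere
    return x

# Example: minimize the quadratic f(x) = 0.5 * ||x||^2
x_out = dowg_sketch(grad=lambda x: x, x0=np.ones(5))
```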

Faster federated optimization under second-order similarity

no code implementations • 6 Sep 2022 • Ahmed Khaled, Chi Jin

Federated learning (FL) is a subfield of machine learning where multiple clients try to collaboratively learn a model over a network under communication constraints.

Federated Learning

Federated Optimization Algorithms with Random Reshuffling and Gradient Compression

1 code implementation • 14 Jun 2022 • Abdurakhmon Sadiev, Grigory Malinovsky, Eduard Gorbunov, Igor Sokolov, Ahmed Khaled, Konstantin Burlachenko, Peter Richtárik

To reveal the true advantages of RR in distributed learning with compression, we propose a new method called DIANA-RR that reduces the compression variance and has provably better convergence rates than existing counterparts that use with-replacement sampling of stochastic gradients.
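
The snippet only names DIANA-RR; as a rough illustration of what "reducing the compression variance" can mean, here is a hedged sketch of DIANA-style gradient-difference compression, where each worker compresses the difference between its gradient and a learned shift rather than the gradient itself. This is a generic sketch under those assumptions, not the paper's DIANA-RR method, and rand-k sparsification is used purely as an example compressor.

```python
import numpy as np

def rand_k(v, k, rng):
    """Example unbiased compressor: keep k random coordinates, rescale."""
    mask = np.zeros_like(v)
    idx = rng.choice(v.size, size=k, replace=False)
    mask[idx] = v.size / k
    return v * mask

def diana_style_step(x, grads, shifts, lr=0.1, alpha=0.1, k=2, rng=None):
    """One hedged DIANA-style step: workers send compressed gradient
    *differences*, so the compression error shrinks as the shifts adapt."""
    rng = rng or np.random.default_rng(0)
    msgs = [rand_k(g - h, k, rng) for g, h in zip(grads, shifts)]
    new_shifts = [h + alpha * m for h, m in zip(shifts, msgs)]
    g_hat = np.mean([h + m for h, m in zip(shifts, msgs)], axis=0)
    return x - lr * g_hat, new_shifts

# Example: 3 workers, 5-dimensional gradients of a shared quadratic
rng = np.random.default_rng(1)
x = np.ones(5)
shifts = [np.zeros(5) for _ in range(3)]
grads = [x + rng.normal(scale=0.1, size=5) for _ in range(3)]
x, shifts = diana_style_step(x, grads, shifts, rng=rng)
```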

Federated Learning • Quantization

FLIX: A Simple and Communication-Efficient Alternative to Local Methods in Federated Learning

no code implementations • 22 Nov 2021 • Elnur Gasanov, Ahmed Khaled, Samuel Horváth, Peter Richtárik

A persistent problem in federated learning is that it is not clear what the optimization objective should be: the standard average risk minimization of supervised learning is inadequate in handling several major constraints specific to federated learning, such as communication adaptivity and personalization control.
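
The snippet states the problem but not the proposed fix; as a hedged illustration (an assumed form, not quoted from the paper), a FLIX-style objective mixes the shared model $x$ with each client's locally optimal model $x_i^\star$:

$$
\min_{x \in \mathbb{R}^d} \ \frac{1}{n} \sum_{i=1}^{n} f_i\big(\alpha_i x + (1 - \alpha_i)\, x_i^\star\big),
\qquad x_i^\star \in \operatorname*{arg\,min}_{w} f_i(w),
$$

where $\alpha_i \in [0, 1]$ controls personalization: $\alpha_i = 1$ for every client recovers standard average risk minimization, while $\alpha_i = 0$ corresponds to purely local training.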

Distributed Optimization • Federated Learning

Proximal and Federated Random Reshuffling

1 code implementation • NeurIPS 2021 • Konstantin Mishchenko, Ahmed Khaled, Peter Richtárik

Random Reshuffling (RR), also known as Stochastic Gradient Descent (SGD) without replacement, is a popular and theoretically grounded method for finite-sum minimization.
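
Since the snippet defines RR only in words, here is a minimal sketch of plain Random Reshuffling (one pass per epoch over a freshly shuffled permutation of the data). The proximal and federated variants studied in the paper add a prox step and client averaging on top of this and are not shown.

```python
import numpy as np

def random_reshuffling(grad_i, n, x0, lr=0.01, epochs=10, seed=0):
    """Minimal Random Reshuffling (SGD without replacement).

    grad_i(x, i) returns the gradient of the i-th component f_i at x;
    each epoch visits every index exactly once, in a fresh random order.
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(epochs):
        for i in rng.permutation(n):       # sample without replacement
            x = x - lr * grad_i(x, i)
    return x

# Example: least squares with f_i(x) = 0.5 * (a_i @ x - b_i)**2
rng = np.random.default_rng(1)
A, b = rng.normal(size=(50, 5)), rng.normal(size=50)
x_rr = random_reshuffling(lambda x, i: (A[i] @ x - b[i]) * A[i], n=50, x0=np.zeros(5))
```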

Unified Analysis of Stochastic Gradient Methods for Composite Convex and Smooth Optimization

no code implementations • 20 Jun 2020 • Ahmed Khaled, Othmane Sebbouh, Nicolas Loizou, Robert M. Gower, Peter Richtárik

We showcase this by obtaining a simple formula for the optimal minibatch size of two variance reduced methods (\textit{L-SVRG} and \textit{SAGA}).

Quantization

Random Reshuffling: Simple Analysis with Vast Improvements

1 code implementation • NeurIPS 2020 • Konstantin Mishchenko, Ahmed Khaled, Peter Richtárik

We improve the dependence of the rate on the condition number (e.g., from $\kappa$ to $\sqrt{\kappa}$) and, in addition, show that RR has a different type of variance.

Better Theory for SGD in the Nonconvex World

no code implementations • 9 Feb 2020 • Ahmed Khaled, Peter Richtárik

Moreover, we perform our analysis in a framework which allows for a detailed study of the effects of a wide array of sampling strategies and minibatch sizes for finite-sum optimization problems.

Distributed Fixed Point Methods with Compressed Iterates

no code implementations • 20 Dec 2019 • Sélim Chraibi, Ahmed Khaled, Dmitry Kovalev, Peter Richtárik, Adil Salim, Martin Takáč

We propose basic and natural assumptions under which iterative optimization methods with compressed iterates can be analyzed.

Federated Learning

Gradient Descent with Compressed Iterates

no code implementations • 10 Sep 2019 • Ahmed Khaled, Peter Richtárik

We propose and analyze a new type of stochastic first order method: gradient descent with compressed iterates (GDCI).
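
The listing gives only the name; a hedged reading of "gradient descent with compressed iterates" is that each step evaluates the gradient at a compressed (e.g., randomly sparsified) copy of the current iterate, as sketched below. Treat the compressor choice and the exact placement of compression as assumptions; see the paper for the precise method.

```python
import numpy as np

def gd_compressed_iterates(grad, x0, lr=0.1, steps=200, k=2, seed=0):
    """Sketch of gradient descent where the iterate is compressed before
    the gradient is evaluated (rand-k sparsification used as an example)."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(steps):
        mask = np.zeros_like(x)
        idx = rng.choice(x.size, size=k, replace=False)
        mask[idx] = x.size / k                 # unbiased rand-k compressor
        x_compressed = x * mask
        x = x - lr * grad(x_compressed)        # gradient at the compressed copy
    return x

# Example: minimize f(x) = 0.5 * ||x||^2 in 5 dimensions
x_out = gd_compressed_iterates(grad=lambda x: x, x0=np.ones(5))
```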

Federated Learning

First Analysis of Local GD on Heterogeneous Data

no code implementations • 10 Sep 2019 • Ahmed Khaled, Konstantin Mishchenko, Peter Richtárik

We provide the first convergence analysis of local gradient descent for minimizing the average of smooth and convex but otherwise arbitrary functions.
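
The snippet describes local gradient descent only at a high level; below is a minimal sketch of the usual scheme, in which each client runs several local GD steps from the shared model and the server then averages the resulting iterates. The number of local steps and the plain averaging rule are assumptions of this sketch.

```python
import numpy as np

def local_gd(client_grads, x0, lr=0.1, local_steps=5, rounds=20):
    """Sketch of Local GD: clients take `local_steps` gradient steps on their
    own objective, then the server averages the local models each round."""
    x = x0.copy()
    for _ in range(rounds):
        local_models = []
        for grad in client_grads:              # one gradient oracle per client
            y = x.copy()
            for _ in range(local_steps):
                y = y - lr * grad(y)
            local_models.append(y)
        x = np.mean(local_models, axis=0)      # communication: average once per round
    return x

# Example: two clients with shifted quadratics (heterogeneous data)
grads = [lambda y: y - 1.0, lambda y: y + 1.0]
x_avg = local_gd(grads, x0=np.zeros(3))
```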

Federated Learning

Tighter Theory for Local SGD on Identical and Heterogeneous Data

no code implementations • 10 Sep 2019 • Ahmed Khaled, Konstantin Mishchenko, Peter Richtárik

We provide a new analysis of local SGD, removing unnecessary assumptions and elaborating on the difference between two data regimes: identical and heterogeneous.
