no code implementations • 6 Mar 2024 • Aaron Mishkin, Ahmed Khaled, Yuanhao Wang, Aaron Defazio, Robert M. Gower
We develop new sub-optimality bounds for gradient descent (GD) that depend on the conditioning of the objective along the path of optimization, rather than on global, worst-case constants.
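To illustrate the gap between path-wise and worst-case conditioning, here is a minimal numerical sketch on a hypothetical ill-conditioned quadratic (not the paper's bound): along GD's actual trajectory, the effective smoothness is often far below the global Lipschitz constant.

```python
import numpy as np

# Hypothetical quadratic with a few very stiff directions; global smoothness L = 100.
rng = np.random.default_rng(0)
eigs = np.concatenate([np.full(5, 100.0), np.full(45, 1.0)])
A = np.diag(eigs)

def grad(x):
    return A @ x

x = rng.standard_normal(50)
L_global = eigs.max()
step = 1.0 / L_global

for k in range(20):
    x_new = x - step * grad(x)
    # Smoothness measured along the segment GD actually traverses; it is often
    # much smaller than the global worst-case constant L.
    local_L = np.linalg.norm(grad(x_new) - grad(x)) / np.linalg.norm(x_new - x)
    print(f"iter {k:2d}: path-wise smoothness ~ {local_L:6.2f}  vs global L = {L_global:.0f}")
    x = x_new
```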
no code implementations • 12 Feb 2024 • Ahmed Khaled, Chi Jin
For the task of finding a stationary point of a smooth and potentially nonconvex function, we give a variant of SGD that matches the best-known high-probability convergence rate for tuned SGD at only an additional polylogarithmic cost.
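For contrast, a sketch of the tuned-SGD baseline such a result is measured against: the classic step-size choice needs the smoothness $L$, the noise level $\sigma$, and the initial gap $\Delta$, which is exactly the problem knowledge a tuning-free variant seeks to avoid. This is an illustrative baseline only, not the paper's method, and the step-size formula is the standard one up to constants.

```python
import numpy as np

def tuned_sgd(stoch_grad, x0, L, sigma, T, Delta, rng=np.random.default_rng(0)):
    """Baseline tuned SGD for finding an approximate stationary point of a smooth,
    possibly nonconvex objective. The step size depends on L, sigma, and Delta,
    i.e. on quantities that must be known or tuned in advance.
    (Illustrative baseline, not the paper's tuning-free variant.)"""
    eta = min(1.0 / L, np.sqrt(Delta / (L * sigma**2 * T)))  # classic tuned step size (constants omitted)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(T):
        x = x - eta * stoch_grad(x, rng)  # stoch_grad is a hypothetical noisy-gradient oracle
    return x
```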
1 code implementation • NeurIPS 2023 • Ahmed Khaled, Konstantin Mishchenko, Chi Jin
This paper proposes a new easy-to-implement parameter-free gradient-based optimizer: DoWG (Distance over Weighted Gradients).
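A minimal sketch of what a distance-over-weighted-gradients step could look like, assuming the step size is the squared running distance from the initial point divided by the square root of a distance-weighted sum of squared gradient norms; the initial estimate `r_eps` and the omission of any projection step are simplifications, so consult the paper for the exact algorithm.

```python
import numpy as np

def dowg_sketch(grad, x0, steps=100, r_eps=1e-4):
    """Sketch of a DoWG-style parameter-free step size: distance travelled from x0
    over a weighted sum of past gradient norms. Assumed update rule; see the paper
    for the exact algorithm (projection, constants)."""
    x0 = np.asarray(x0, dtype=float)
    x = x0.copy()
    r = r_eps   # running estimate of the distance from the initial point
    v = 0.0     # distance-weighted sum of squared gradient norms
    for _ in range(steps):
        g = grad(x)
        r = max(r, np.linalg.norm(x - x0))
        v += r**2 * np.linalg.norm(g)**2
        x = x - (r**2 / np.sqrt(v)) * g   # distance-over-weighted-gradients step
    return x

# Toy usage: minimize a simple quadratic without tuning a learning rate.
x_hat = dowg_sketch(lambda x: 2.0 * x, x0=np.ones(3), steps=1000)
```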
no code implementations • 6 Sep 2022 • Ahmed Khaled, Chi Jin
Federated learning (FL) is a subfield of machine learning where multiple clients try to collaboratively learn a model over a network under communication constraints.
1 code implementation • 14 Jun 2022 • Abdurakhmon Sadiev, Grigory Malinovsky, Eduard Gorbunov, Igor Sokolov, Ahmed Khaled, Konstantin Burlachenko, Peter Richtárik
To reveal the true advantages of RR in distributed learning with compression, we propose a new method called DIANA-RR that reduces the compression variance and has provably better convergence rates than existing counterparts that use with-replacement sampling of stochastic gradients.
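A hedged sketch of the DIANA-style mechanism that DIANA-RR builds on: compressing gradient differences against a learned shift rather than the gradients themselves, so that the compression error shrinks over time. This omits the random-reshuffling part and is not the full DIANA-RR method; the problem data and constants are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_k(v, k):
    """Unbiased rand-k sparsification: keep k random coordinates, rescaled by d/k."""
    out = np.zeros_like(v)
    idx = rng.choice(v.size, size=k, replace=False)
    out[idx] = (v.size / k) * v[idx]
    return out

# Each client keeps a shift h_i, sends the compressed difference g_i - h_i, and
# slowly moves h_i toward g_i, so the variance of the compressed messages vanishes
# as h_i approaches g_i.
n, d, alpha, lr = 10, 20, 0.1, 0.05   # alpha roughly 1/(1 + omega) for rand-k compression
A = [rng.standard_normal((30, d)) for _ in range(n)]
b = [rng.standard_normal(30) for _ in range(n)]
h = [np.zeros(d) for _ in range(n)]
x = np.zeros(d)

for _ in range(300):
    estimates = []
    for i in range(n):
        g_i = A[i].T @ (A[i] @ x - b[i]) / 30   # client i's local least-squares gradient
        m_i = rand_k(g_i - h[i], k=2)            # compressed message actually sent
        estimates.append(h[i] + m_i)             # server-side reconstruction of g_i
        h[i] = h[i] + alpha * m_i                # shift update, mirrored on the server
    x = x - lr * np.mean(estimates, axis=0)
```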
no code implementations • 22 Nov 2021 • Elnur Gasanov, Ahmed Khaled, Samuel Horváth, Peter Richtárik
A persistent problem in federated learning is that it is not clear what the optimization objective should be: the standard average risk minimization of supervised learning is inadequate for handling several major constraints specific to federated learning, such as communication adaptivity and personalization control.
1 code implementation • NeurIPS 2021 • Konstantin Mishchenko, Ahmed Khaled, Peter Richtárik
Random Reshuffling (RR), also known as Stochastic Gradient Descent (SGD) without replacement, is a popular and theoretically grounded method for finite-sum minimization.
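A minimal sketch of the method on a toy finite sum: one pass over a fresh permutation per epoch, i.e. sampling without replacement, in contrast to plain SGD's with-replacement sampling.

```python
import numpy as np

def random_reshuffling(grads, x0, lr=0.1, epochs=10, rng=np.random.default_rng(0)):
    """Random Reshuffling: each epoch draws a fresh permutation of the n component
    gradients and processes them once each (sampling without replacement), unlike
    plain SGD, which samples an index with replacement at every step."""
    x = np.asarray(x0, dtype=float).copy()
    n = len(grads)
    for _ in range(epochs):
        for i in rng.permutation(n):
            x = x - lr * grads[i](x)
    return x

# Toy finite sum: f(x) = (1/n) * sum_i 0.5 * (x - a_i)^2, minimized at mean(a).
a = np.array([1.0, 2.0, 3.0, 4.0])
x_rr = random_reshuffling([(lambda x, ai=ai: x - ai) for ai in a], x0=0.0)
```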
no code implementations • 20 Jun 2020 • Ahmed Khaled, Othmane Sebbouh, Nicolas Loizou, Robert M. Gower, Peter Richtárik
We showcase this by obtaining a simple formula for the optimal minibatch size of two variance-reduced methods (L-SVRG and SAGA).
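For reference, a sketch of the single-sample L-SVRG method the minibatch analysis applies to; the optimal-minibatch-size formula itself is in the paper and is not reproduced here.

```python
import numpy as np

def l_svrg(grads, full_grad, x0, lr=0.1, p=None, steps=500, rng=np.random.default_rng(0)):
    """Loopless SVRG: the usual SVRG variance-reduced gradient estimator, but the
    reference point w is refreshed at random with probability p instead of on a fixed
    inner/outer loop schedule. Single-sample sketch; the minibatch version replaces
    the sampled index i with a sampled minibatch."""
    n = len(grads)
    p = p if p is not None else 1.0 / n
    x = np.asarray(x0, dtype=float).copy()
    w, gw = x.copy(), full_grad(x)          # reference point and its full gradient
    for _ in range(steps):
        i = rng.integers(n)
        g = grads[i](x) - grads[i](w) + gw  # variance-reduced gradient estimator
        x = x - lr * g
        if rng.random() < p:                # loopless reference-point update
            w, gw = x.copy(), full_grad(x)
    return x
```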
1 code implementation • NeurIPS 2020 • Konstantin Mishchenko, Ahmed Khaled, Peter Richtárik
We show that RR improves the dependence on the condition number from $\kappa$ to $\sqrt{\kappa}$ and, in addition, that RR has a different type of variance.
no code implementations • 9 Feb 2020 • Ahmed Khaled, Peter Richtárik
Moreover, we perform our analysis in a framework which allows for a detailed study of the effects of a wide array of sampling strategies and minibatch sizes for finite-sum optimization problems.
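Two sampling strategies that such a framework has to distinguish, as a small sketch: $\tau$-nice (without-replacement) and with-replacement minibatches give the same unbiased gradient estimator but with different variances, which is what drives the effect of the minibatch size.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_minibatch(n, tau, strategy="tau-nice"):
    """Two standard sampling strategies for a finite sum of n terms: tau-nice sampling
    draws a uniformly random subset of size tau (without replacement), while
    with-replacement sampling draws tau independent indices. Both yield unbiased
    minibatch gradient estimators, but their variances differ."""
    if strategy == "tau-nice":
        return rng.choice(n, size=tau, replace=False)
    return rng.integers(n, size=tau)   # with replacement

# The minibatch estimator itself is the same under either strategy:
# g = mean(grad_i(x) for i in sample_minibatch(n, tau, strategy)).
```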
no code implementations • 20 Dec 2019 • Sélim Chraibi, Ahmed Khaled, Dmitry Kovalev, Peter Richtárik, Adil Salim, Martin Takáč
We propose basic and natural assumptions under which iterative optimization methods with compressed iterates can be analyzed.
no code implementations • 10 Sep 2019 • Ahmed Khaled, Peter Richtárik
We propose and analyze a new type of stochastic first-order method: gradient descent with compressed iterates (GDCI).
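A sketch of the assumed iteration $x_{k+1} = \mathcal{C}(x_k - \gamma \nabla f(x_k))$, here with stochastic rounding as an example of an unbiased compression operator $\mathcal{C}$; the step size, operator, and toy objective are illustrative choices, not necessarily the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_round(v, grid=0.01):
    """Unbiased compression by stochastic rounding: each coordinate is rounded up or
    down to the nearest grid point with probabilities that preserve the expectation."""
    low = np.floor(v / grid) * grid
    p = (v - low) / grid
    return low + grid * (rng.random(v.shape) < p)

def gdci_sketch(grad, x0, lr=0.1, steps=500):
    """Sketch of gradient descent with compressed iterates, assuming the iterate is
    compressed after each gradient step; the compression error means convergence is
    only to a neighborhood of the solution."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        x = stochastic_round(x - lr * grad(x))   # compress the iterate, not the gradient
    return x

# Toy usage: f(x) = 0.5 * ||x - a||^2 with a hypothetical target a.
a = np.array([0.3, -1.2])
x_hat = gdci_sketch(lambda x: x - a, x0=np.zeros(2))
```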
no code implementations • 10 Sep 2019 • Ahmed Khaled, Konstantin Mishchenko, Peter Richtárik
We provide the first convergence analysis of local gradient descent for minimizing the average of smooth and convex but otherwise arbitrary functions.
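A minimal sketch of local gradient descent on a toy heterogeneous problem: each client takes several full-gradient steps on its own objective between communication rounds, and the server averages the resulting iterates.

```python
import numpy as np

def local_gd(client_grads, x0, lr=0.1, local_steps=5, rounds=50):
    """Local gradient descent: every client runs `local_steps` full-gradient steps on
    its own objective, then the server averages the local iterates. With one local
    step per round this reduces to distributed gradient descent."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(rounds):
        local_iterates = []
        for grad_i in client_grads:
            y = x.copy()
            for _ in range(local_steps):
                y = y - lr * grad_i(y)        # local descent on client i's function
            local_iterates.append(y)
        x = np.mean(local_iterates, axis=0)   # communication round: average iterates
    return x

# Toy heterogeneous clients: f_i(x) = 0.5 * ||x - a_i||^2, average minimized at mean(a_i).
a = [np.array([1.0, 0.0]), np.array([0.0, 2.0]), np.array([3.0, 3.0])]
x_hat = local_gd([(lambda x, ai=ai: x - ai) for ai in a], x0=np.zeros(2))
```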
no code implementations • 10 Sep 2019 • Ahmed Khaled, Konstantin Mishchenko, Peter Richtárik
We provide a new analysis of local SGD, removing unnecessary assumptions and elaborating on the difference between two data regimes: identical and heterogeneous.