2 code implementations • 4 Apr 2024 • Andrei Semenov, Vladimir Ivanov, Aleksandr Beznosikov, Alexander Gasnikov
We propose a novel architecture and method of explainable classification with Concept Bottleneck Models (CBMs).
Ranked #1 on Image Classification on CUB-200-2011
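For readers unfamiliar with the architecture family, here is a minimal PyTorch-style sketch of a concept bottleneck (layer sizes are illustrative, e.g. 200 classes as in CUB-200-2011; this is not the paper's exact model):

```python
import torch
import torch.nn as nn

# Minimal concept-bottleneck sketch: features are mapped to interpretable
# concept logits, and the class label is predicted from the concepts alone.
class ConceptBottleneck(nn.Module):
    def __init__(self, in_dim=512, n_concepts=112, n_classes=200):
        super().__init__()
        self.concept_head = nn.Linear(in_dim, n_concepts)   # features -> concepts
        self.label_head = nn.Linear(n_concepts, n_classes)  # concepts -> label

    def forward(self, features):
        concepts = torch.sigmoid(self.concept_head(features))
        return self.label_head(concepts), concepts  # expose concepts for explanation
```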
no code implementations • 19 Mar 2024 • Nikita Kornilov, Alexander Gasnikov, Alexander Korotin
In recent years, there has been a boom in the development of flow matching methods for generative modeling.
no code implementations • 7 Feb 2024 • Petr Ostroukhov, Aigerim Zhumabayeva, Chulu Xiang, Alexander Gasnikov, Martin Takáč, Dmitry Kamzolov
To substantiate the efficacy of our method, we show experimentally how the introduction of an adaptive step size and an adaptive batch size gradually improves the performance of regular SGD.
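As a toy illustration of the idea (the schedules below are hypothetical placeholders, not the paper's rules; `stoch_grad` is an assumed mini-batch gradient oracle):

```python
import numpy as np

# Sketch: SGD whose step size adapts to the observed gradient norm and whose
# batch size grows over time. Both rules are illustrative only.
def adaptive_sgd(stoch_grad, x0, n_steps=100, batch=8, lr0=0.1):
    x = x0
    for t in range(n_steps):
        g = stoch_grad(x, batch)                # mini-batch gradient estimate
        lr = lr0 / (1.0 + np.linalg.norm(g))    # simple adaptive step size
        x = x - lr * g
        if (t + 1) % 10 == 0:
            batch *= 2                          # gradually enlarge the batch
    return x
```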
no code implementations • 15 Jan 2024 • Daniil Medyakov, Gleb Molodtsov, Aleksandr Beznosikov, Alexander Gasnikov
Therefore, a large amount of research has recently been directed at solving this problem.
1 code implementation • 15 Jan 2024 • Mikhail Rudakov, Aleksandr Beznosikov, Yaroslav Kholodov, Alexander Gasnikov
We analyze compression methods such as quantization and TopK compression, and also experiment with error compensation techniques.
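For concreteness, a standard Top-$K$ sparsifier with error compensation looks roughly as follows (a generic sketch, not the paper's exact pipeline):

```python
import numpy as np

# Top-K compression: transmit only the K largest-magnitude coordinates.
def topk_compress(g, k):
    idx = np.argpartition(np.abs(g), -k)[-k:]  # indices of the K largest entries
    out = np.zeros_like(g)
    out[idx] = g[idx]
    return out

# Error compensation: add the previously discarded part back before compressing,
# and keep what was not transmitted as the new memory.
def compressed_step(g, memory, k):
    corrected = g + memory
    sent = topk_compress(corrected, k)
    return sent, corrected - sent
```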
1 code implementation • 7 Nov 2023 • Nikita Puchkin, Eduard Gorbunov, Nikolay Kutuzov, Alexander Gasnikov
We consider stochastic optimization problems with heavy-tailed noise with structured density.
no code implementations • 3 Oct 2023 • Eduard Gorbunov, Abdurakhmon Sadiev, Marina Danilova, Samuel Horváth, Gauthier Gidel, Pavel Dvurechensky, Alexander Gasnikov, Peter Richtárik
High-probability analysis of stochastic first-order optimization methods under mild assumptions on the noise has been gaining a lot of attention in recent years.
no code implementations • NeurIPS 2023 • Aleksandr Beznosikov, Sergey Samsonov, Marina Sheshukova, Alexander Gasnikov, Alexey Naumov, Eric Moulines
We present a unified approach for the theoretical analysis of first-order gradient methods for stochastic optimization and variational inequalities.
1 code implementation • 11 May 2023 • Yuriy Dorn, Nikita Kornilov, Nikolay Kutuzov, Alexander Nazin, Eduard Gorbunov, Alexander Gasnikov
We establish convergence results under mild assumptions on the rewards distribution and demonstrate that INF-clip is optimal for linear heavy-tailed stochastic MAB problems and works well for non-linear ones.
no code implementations • NeurIPS 2023 • Aleksandr Beznosikov, Martin Takáč, Alexander Gasnikov
The methods presented in this paper have the best theoretical guarantees on communication complexity and significantly outperform other methods for distributed variational inequalities.
no code implementations • 2 Feb 2023 • Abdurakhmon Sadiev, Marina Danilova, Eduard Gorbunov, Samuel Horváth, Gauthier Gidel, Pavel Dvurechensky, Alexander Gasnikov, Peter Richtárik
In recent years, the interest of the optimization and machine learning communities in the high-probability convergence of stochastic optimization methods has been growing.
no code implementations • 29 Dec 2022 • Alexander Gasnikov, Dmitry Kovalev, Grigory Malinovsky
In this paper we study the smooth strongly convex minimization problem $\min_{x}\min_y f(x, y)$.
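One standard way to approach such min-min problems (a sketch under the assumption that the inner problem is solved to high accuracy; `argmin_y` and `f_grad_x` are illustrative oracles) is to run an outer method on $g(x) := \min_y f(x, y)$, whose gradient at the inner minimizer is $\nabla_x f(x, y^*(x))$:

```python
# Outer gradient descent on g(x) = min_y f(x, y); not the paper's algorithm,
# just the textbook reduction it improves upon.
def min_min(f_grad_x, argmin_y, x0, lr=0.1, n_steps=200):
    x = x0
    for _ in range(n_steps):
        y = argmin_y(x)               # (approximately) solve the inner problem
        x = x - lr * f_grad_x(x, y)   # gradient of g at the inner minimizer
    return x
```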
no code implementations • 12 Oct 2022 • Aleksandr Beznosikov, Alexander Gasnikov
In this paper we consider the problem of stochastic finite-sum cocoercive variational inequalities.
no code implementations • 29 Aug 2022 • Aleksandr Beznosikov, Boris Polyak, Eduard Gorbunov, Dmitry Kovalev, Alexander Gasnikov
This paper is a survey of methods for solving smooth (strongly) monotone stochastic variational inequalities.
no code implementations • 19 Jun 2022 • Aleksandr Beznosikov, Alexander Gasnikov
Variational inequalities are an important tool that encompasses minimization, saddle-point, game, and fixed-point problems.
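For reference, the standard (Stampacchia) formulation that unifies these settings is:

```latex
% Find x^* \in \mathcal{X} such that
\[ \langle F(x^*),\, x - x^* \rangle \;\ge\; 0 \qquad \forall\, x \in \mathcal{X}. \]
% Taking F = \nabla f recovers minimization; for a saddle-point problem
% \min_x \max_y L(x, y), one takes F(x, y) = (\nabla_x L(x, y),\, -\nabla_y L(x, y)).
```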
no code implementations • 16 Jun 2022 • Aleksandr Beznosikov, Aibek Alanov, Dmitry Kovalev, Martin Takáč, Alexander Gasnikov
Methods with adaptive scaling of different features play a key role in solving saddle-point problems, primarily due to Adam's popularity for solving adversarial machine learning problems, including GAN training.
no code implementations • 3 Jun 2022 • Egor Gladin, Maksim Lavrik-Karmazin, Karina Zainullina, Varvara Rudenko, Alexander Gasnikov, Martin Takáč
The problem of constrained Markov decision processes is considered.
1 code implementation • 2 Jun 2022 • Eduard Gorbunov, Marina Danilova, David Dobre, Pavel Dvurechensky, Alexander Gasnikov, Gauthier Gidel
In this work, we prove the first high-probability complexity results with logarithmic dependence on the confidence level for stochastic methods for solving monotone and structured non-monotone VIPs with non-sub-Gaussian (heavy-tailed) noise and unbounded domains.
no code implementations • 30 May 2022 • Dmitry Kovalev, Aleksandr Beznosikov, Ekaterina Borodich, Alexander Gasnikov, Gesualdo Scutari
Finally, the method is extended to distributed saddle-point problems (under function similarity) by means of solving a class of variational inequalities, achieving lower communication and computation complexity bounds.
1 code implementation • 19 May 2022 • Dmitry Kovalev, Alexander Gasnikov
Arjevani et al. (2019) established the lower bound $\Omega\left(\epsilon^{-2/(3p+1)}\right)$ on the number of the $p$-th order oracle calls required by an algorithm to find an $\epsilon$-accurate solution to the problem, where the $p$-th order oracle stands for the computation of the objective function value and the derivatives up to the order $p$.
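Instantiating the exponent $2/(3p+1)$ at small $p$ recovers the familiar rates:

```latex
\[ p = 1:\ \Omega\bigl(\epsilon^{-1/2}\bigr), \qquad
   p = 2:\ \Omega\bigl(\epsilon^{-2/7}\bigr), \qquad
   p = 3:\ \Omega\bigl(\epsilon^{-1/5}\bigr). \]
```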
no code implementations • 11 May 2022 • Dmitry Kovalev, Alexander Gasnikov
However, the existing state-of-the-art methods do not match this lower bound: algorithms of Lin et al. (2020) and Wang and Li (2020) have gradient evaluation complexity $\mathcal{O}\left( \sqrt{\kappa_x\kappa_y}\log^3\frac{1}{\epsilon}\right)$ and $\mathcal{O}\left( \sqrt{\kappa_x\kappa_y}\log^3 (\kappa_x\kappa_y)\log\frac{1}{\epsilon}\right)$, respectively.
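For context, the lower bound referenced here is (as established by Zhang, Hong, and Zhang, 2019, for strongly-convex-strongly-concave problems):

```latex
\[ \Omega\!\left( \sqrt{\kappa_x \kappa_y}\, \log\frac{1}{\epsilon} \right), \]
% so the complexities above exceed the lower bound by logarithmic factors only.
```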
no code implementations • 6 Feb 2022 • Dmitry Kovalev, Aleksandr Beznosikov, Abdurakhmon Sadiev, Michael Persiianov, Peter Richtárik, Alexander Gasnikov
Our algorithms achieve the best guarantees in the available literature, not only in the decentralized stochastic case but also in the decentralized deterministic and non-distributed stochastic cases.
no code implementations • 30 Dec 2021 • Dmitry Kovalev, Alexander Gasnikov, Peter Richtárik
In this paper we study the convex-concave saddle-point problem $\min_x \max_y f(x) + y^T \mathbf{A} x - g(y)$, where $f(x)$ and $g(y)$ are smooth and convex functions.
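A natural baseline for this problem class is the extragradient method (a generic sketch, not the paper's optimal algorithm; `grad_f` and `grad_g` are assumed gradient oracles):

```python
import numpy as np

# Extragradient for min_x max_y f(x) + y^T A x - g(y): take a trial step,
# then update using the gradient field evaluated at the trial point.
def extragradient(grad_f, grad_g, A, x, y, lr=0.05, n_steps=1000):
    for _ in range(n_steps):
        gx, gy = grad_f(x) + A.T @ y, A @ x - grad_g(y)    # field at (x, y)
        xh, yh = x - lr * gx, y + lr * gy                  # extrapolation step
        gx, gy = grad_f(xh) + A.T @ yh, A @ xh - grad_g(yh)
        x, y = x - lr * gx, y + lr * gy                    # actual update
    return x, y
```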
no code implementations • NeurIPS 2021 • Dmitry Kovalev, Elnur Gasanov, Alexander Gasnikov, Peter Richtarik
We consider the task of minimizing the sum of smooth and strongly convex functions stored in a decentralized manner across the nodes of a communication network whose links are allowed to change in time.
no code implementations • NeurIPS 2021 • Aleksandr Beznosikov, Gesualdo Scutari, Alexander Rogozin, Alexander Gasnikov
We study solution methods for (strongly-)convex-(strongly-)concave Saddle-Point Problems (SPPs) over networks of two types: master/workers (thus centralized) architectures and mesh (thus decentralized) networks.
no code implementations • 24 Oct 2021 • Ye Tian, Gesualdo Scutari, Tianyu Cao, Alexander Gasnikov
To reduce the number of communications needed to reach a given solution accuracy, we propose a preconditioned, accelerated distributed method.
no code implementations • 7 Oct 2021 • Aleksandr Beznosikov, Peter Richtárik, Michael Diskin, Max Ryabinin, Alexander Gasnikov
Due to these considerations, it is important to equip existing methods with strategies that reduce the volume of information transmitted during training while obtaining a model of comparable quality.
1 code implementation • 22 Jul 2021 • Aleksandr Beznosikov, Gesualdo Scutari, Alexander Rogozin, Alexander Gasnikov
We study solution methods for (strongly-)convex-(strongly-)concave Saddle-Point Problems (SPPs) over networks of two types: master/workers (thus centralized) architectures and mesh (thus decentralized) networks.
no code implementations • 15 Jun 2021 • Aleksandr Beznosikov, Pavel Dvurechensky, Anastasia Koloskova, Valentin Samokhin, Sebastian U Stich, Alexander Gasnikov
We extend the stochastic extragradient method to this very general setting and theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone (when a Minty solution exists) settings.
no code implementations • 14 Jun 2021 • Ekaterina Borodich, Aleksandr Beznosikov, Abdurakhmon Sadiev, Vadim Sushko, Nikolay Savelyev, Martin Takáč, Alexander Gasnikov
Personalized Federated Learning (PFL) has witnessed remarkable advancements, enabling the development of innovative machine learning applications that preserve the privacy of training data.
1 code implementation • 10 Jun 2021 • Eduard Gorbunov, Marina Danilova, Innokentiy Shibaev, Pavel Dvurechensky, Alexander Gasnikov
In our paper, we resolve this issue and derive the first high-probability convergence results with logarithmic dependence on the confidence level for non-smooth convex stochastic optimization problems with non-sub-Gaussian (heavy-tailed) noise.
no code implementations • NeurIPS 2021 • Eduard Gorbunov, Marina Danilova, Innokentiy Andreevich Shibaev, Pavel Dvurechensky, Alexander Gasnikov
In our paper, we resolve this issue and derive the first high-probability convergence results with logarithmic dependence on the confidence level for non-smooth convex stochastic optimization problems with non-sub-Gaussian (heavy-tailed) noise.
no code implementations • 27 Feb 2021 • Daniil Tiapkin, Alexander Gasnikov
We consider the problem of learning the optimal policy for infinite-horizon Markov decision processes (MDPs).
no code implementations • 18 Feb 2021 • Dmitry Kovalev, Egor Shulgin, Peter Richtárik, Alexander Rogozin, Alexander Gasnikov
We propose ADOM - an accelerated method for smooth and strongly convex decentralized optimization over time-varying networks.
no code implementations • 16 Feb 2021 • Pavel Dvurechensky, Dmitry Kamzolov, Aleksandr Lukashevich, Soomin Lee, Erik Ordentlich, César A. Uribe, Alexander Gasnikov
Statistical preconditioning enables fast methods for distributed large-scale empirical risk minimization problems.
Distributed Optimization • Optimization and Control
no code implementations • 15 Feb 2021 • Alexander Rogozin, Alexander Beznosikov, Darina Dvinskikh, Dmitry Kovalev, Pavel Dvurechensky, Alexander Gasnikov
We consider distributed convex-concave saddle point problems over arbitrary connected undirected networks and propose a decentralized distributed algorithm for their solution.
Distributed Optimization • Optimization and Control • Distributed, Parallel, and Cluster Computing
no code implementations • 1 Feb 2021 • Nikita Yudin, Alexander Gasnikov
This work presents a novel version of the recently developed Gauss-Newton method for solving systems of nonlinear equations, based on an upper bound on the solution residual and on quadratic regularization ideas.
Stochastic Optimization • Optimization and Control
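As background, the classical Gauss-Newton iteration that this work builds on can be sketched as follows (a plain scheme with a small regularizer for numerical stability; the paper's residual upper bounds and regularization strategy are not reproduced here):

```python
import numpy as np

# Classical Gauss-Newton step for F(x) = 0: linearize F and solve the
# (regularized) normal equations (J^T J + reg * I) d = J^T F(x).
def gauss_newton(F, J, x0, n_steps=50, reg=1e-8):
    x = x0
    for _ in range(n_steps):
        r, Jx = F(x), J(x)   # residual vector and Jacobian matrix
        d = np.linalg.solve(Jx.T @ Jx + reg * np.eye(x.size), Jx.T @ r)
        x = x - d
    return x
```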
no code implementations • 11 Jan 2021 • Vasilii Novitskii, Alexander Gasnikov
We consider a $\beta$-smooth (i.e., satisfying the generalized Hölder condition with parameter $\beta > 2$) stochastic convex optimization problem with a zeroth-order one-point oracle.
Optimization and Control
no code implementations • 31 Dec 2020 • Petr Ostroukhov, Rinat Kamalov, Pavel Dvurechensky, Alexander Gasnikov
The first method is based on the assumption of $p$-th order smoothness of the objective and it achieves a convergence rate of $O \left( \left( \frac{L_p R^{p - 1}}{\mu} \right)^\frac{2}{p + 1} \log \frac{\mu R^2}{\varepsilon_G} \right)$, where $R$ is an estimate of the initial distance to the solution, and $\varepsilon_G$ is the error in terms of duality gap.
Optimization and Control
no code implementations • 31 Dec 2020 • Artem Agafonov, Dmitry Kamzolov, Pavel Dvurechensky, Alexander Gasnikov
We propose general non-accelerated and accelerated tensor methods under inexact information on the derivatives of the objective and analyze their convergence rates.
Optimization and Control
no code implementations • 11 Dec 2020 • Marina Danilova, Pavel Dvurechensky, Alexander Gasnikov, Eduard Gorbunov, Sergey Guminov, Dmitry Kamzolov, Innokentiy Shibaev
For this setting, we first present known results for the convergence rates of deterministic first-order methods, which are then followed by a general theoretical analysis of optimal stochastic and randomized gradient schemes, and an overview of the stochastic first-order methods.
no code implementations • 8 Dec 2020 • Ekaterina Kotliarova, Alexander Gasnikov, Evgenia Gasnikova
The first block consists of a model for calculating the correspondence (demand) matrix, whereas the second block is a traffic assignment model.
Optimization and Control
no code implementations • 25 Oct 2020 • Aleksandr Beznosikov, Valentin Samokhin, Alexander Gasnikov
This paper focuses on the distributed optimization of stochastic saddle point problems.
no code implementations • 21 Sep 2020 • Abdurakhmon Sadiev, Aleksandr Beznosikov, Pavel Dvurechensky, Alexander Gasnikov
In particular, our analysis shows that in the case when the feasible set is a direct product of two simplices, our convergence rate for the stochastic term is only a $\log n$ factor worse than that of first-order methods.
1 code implementation • 6 Aug 2020 • Meruza Kubentayeva, Alexander Gasnikov
In this paper we consider the application of several gradient methods to the traffic assignment problem: we search equilibria in the stable dynamics model (Nesterov and De Palma, 2003) and the Beckmann model.
Optimization and Control
no code implementations • 11 Jun 2020 • Daniil Tiapkin, Alexander Gasnikov, Pavel Dvurechensky
This leads to a complicated stochastic optimization problem in which the objective is the expectation of a function that is itself the solution to a random optimization problem.
1 code implementation • NeurIPS 2020 • Eduard Gorbunov, Marina Danilova, Alexander Gasnikov
In this paper, we propose a new accelerated stochastic first-order method called clipped-SSTM for smooth convex stochastic optimization with heavy-tailed noise in the stochastic gradients. We derive the first high-probability complexity bounds for this method, closing a gap in the theory of stochastic optimization with heavy-tailed noise.
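The clipping operator at the heart of the method is simple to state (a generic sketch of clipped SGD, not the exact clipped-SSTM update, which also uses acceleration):

```python
import numpy as np

# Clip a stochastic gradient whose norm exceeds a threshold lambda_.
def clip(g, lambda_):
    norm = np.linalg.norm(g)
    return g if norm <= lambda_ else (lambda_ / norm) * g

def clipped_sgd(stoch_grad, x0, lr=0.01, lambda_=1.0, n_steps=1000):
    x = x0
    for _ in range(n_steps):
        x = x - lr * clip(stoch_grad(x), lambda_)  # heavy tails are truncated
    return x
```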
no code implementations • 12 May 2020 • Aleksandr Beznosikov, Abdurakhmon Sadiev, Alexander Gasnikov
In the second part of the paper, we analyze the case when such an assumption cannot be made: we propose a general approach to modifying the method to solve this problem, and we apply this approach to particular cases of some classical sets.
2 code implementations • 18 Apr 2020 • Darina Dvinskikh, Dmitry Kamzolov, Alexander Gasnikov, Pavel Dvurechensky, Dmitry Pasechnyk, Vladislav Matykhin, Alexei Chernov
We propose an accelerated meta-algorithm, which allows one to obtain accelerated methods for convex unconstrained minimization in different settings.
Optimization and Control
1 code implementation • 25 Nov 2019 • Anastasiya Ivanova, Dmitry Pasechnyuk, Dmitry Grishchenko, Egor Shulgin, Alexander Gasnikov, Vladislav Matyukhin
In this paper, we present a generic framework that allows accelerating almost arbitrary non-accelerated deterministic and randomized algorithms for smooth convex optimization problems.
Optimization and Control
1 code implementation • 19 Nov 2019 • Aleksandr Ogaltsov, Darina Dvinskikh, Pavel Dvurechensky, Alexander Gasnikov, Vladimir Spokoiny
In this paper we propose several adaptive gradient methods for stochastic optimization.
Optimization and Control
no code implementations • 9 Jun 2019 • Sergey Guminov, Pavel Dvurechensky, Nazarii Tupitsa, Alexander Gasnikov
In this paper we combine AM and Nesterov's acceleration to propose an accelerated alternating minimization algorithm.
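The non-accelerated scheme being sped up is plain alternating minimization (a sketch; `argmin_x` and `argmin_y` denote exact block-minimization oracles):

```python
# Plain alternating minimization over two blocks; the paper adds
# Nesterov-type momentum on top of this scheme.
def alternating_minimization(argmin_x, argmin_y, x, y, n_steps=100):
    for _ in range(n_steps):
        x = argmin_x(y)   # minimize f(., y) exactly over the x-block
        y = argmin_y(x)   # minimize f(x, .) exactly over the y-block
    return x, y
```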
no code implementations • 3 Sep 2018 • César A. Uribe, Soomin Lee, Alexander Gasnikov, Angelia Nedić
Then, we study distributed optimization algorithms for non-dual-friendly functions, as well as a method to improve the dependency on the parameters of the functions involved.
1 code implementation • 24 Jun 2018 • Anastasiya Ivanova, Alexander Gasnikov, Evgeni Nurminski, Evgeniya Vorontsova
We consider the resource allocation problem and its numerical solution.
Optimization and Control • 90B99
2 code implementations • 10 Apr 2018 • Eduard Gorbunov, Evgeniya Vorontsova, Alexander Gasnikov
We consider the problem of obtaining upper bounds on the expectation of the $q$-norm ($2 \leqslant q \leqslant \infty$) of a vector uniformly distributed on the unit Euclidean sphere.
Optimization and Control
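Such expectations are easy to sanity-check numerically, using the fact that a normalized standard Gaussian vector is uniform on the sphere (a quick Monte Carlo sketch, not part of the paper):

```python
import numpy as np

# Estimate E ||u||_q for u uniform on the unit sphere in R^n.
def mean_q_norm(n=1000, q=4.0, n_samples=10000, seed=0):
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_samples, n))
    u = z / np.linalg.norm(z, axis=1, keepdims=True)  # uniform on the sphere
    return np.linalg.norm(u, ord=q, axis=1).mean()
```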
no code implementations • 8 Mar 2018 • César A. Uribe, Darina Dvinskikh, Pavel Dvurechensky, Alexander Gasnikov, Angelia Nedić
We propose a new class-optimal algorithm for the distributed computation of Wasserstein Barycenters over networks.
1 code implementation • 25 Feb 2018 • Eduard Gorbunov, Pavel Dvurechensky, Alexander Gasnikov
In the two-point feedback setting, i.e., when pairs of function values are available, we propose an accelerated derivative-free algorithm together with its complexity analysis.
Optimization and Control • Computational Complexity
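A standard two-point gradient estimator of the kind used in this setting looks as follows (a generic sketch; normalization conventions vary across papers):

```python
import numpy as np

# Two-point (derivative-free) gradient estimator: query f at x + t*e and
# x - t*e for a random unit direction e, and scale by the dimension.
def two_point_grad(f, x, t=1e-4, rng=None):
    rng = rng or np.random.default_rng()
    e = rng.standard_normal(x.shape)
    e /= np.linalg.norm(e)   # random unit direction
    return x.size * (f(x + t * e) - f(x - t * e)) / (2 * t) * e
```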
1 code implementation • ICML 2018 • Pavel Dvurechensky, Alexander Gasnikov, Alexey Kroshnin
We analyze two algorithms for approximating the general optimal transport (OT) distance between two discrete distributions of size $n$, up to accuracy $\varepsilon$.
Data Structures and Algorithms • Optimization and Control
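One of the two algorithms analyzed is based on Sinkhorn's iterations, which can be sketched in a few lines (the entropic regularization parameter `eps` and the iteration count are illustrative):

```python
import numpy as np

# Sinkhorn iterations for entropy-regularized OT between histograms p and q
# with cost matrix C: alternately rescale the kernel to match the marginals.
def sinkhorn(C, p, q, eps=0.1, n_iters=1000):
    K = np.exp(-C / eps)
    u = np.ones_like(p)
    for _ in range(n_iters):
        v = q / (K.T @ u)            # fit the column marginals
        u = p / (K @ v)              # fit the row marginals
    P = u[:, None] * K * v[None, :]  # approximate transport plan
    return np.sum(P * C)             # approximate OT cost
```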
no code implementations • 1 Dec 2017 • César A. Uribe, Soomin Lee, Alexander Gasnikov, Angelia Nedić
In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks.
2 code implementations • 30 Sep 2017 • Evgeniya Vorontsova, Alexander Gasnikov, Eduard Gorbunov
In the paper, we show how to make Nesterov's method $n$ times faster (up to a $\log n$ factor) in this case.
Optimization and Control
no code implementations • NeurIPS 2016 • Lev Bogolubsky, Pavel Dvurechenskii, Alexander Gasnikov, Gleb Gusev, Yurii Nesterov, Andrei M. Raigorodskii, Aleksey Tikhonov, Maksim Zhukovskii
In this paper, we consider a non-convex loss-minimization problem of learning Supervised PageRank models, which can account for features of nodes and edges.