no code implementations • 4 Feb 2024 • Arsalan SharifNassab, Saber Salehkaleybar, Richard Sutton
This paper addresses the challenge of optimizing meta-parameters (i.e., hyperparameters) in machine learning algorithms, a critical factor influencing training efficiency and model performance.
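To make the setting concrete, here is a minimal sketch of online step-size adaptation via a meta-gradient (a hypergradient-style update); the toy loss, update rule, and all names are illustrative assumptions, not the algorithm proposed in the paper.

```python
def grad(w):
    # Toy loss f(w) = 0.5 * (w - 3)^2, so grad f(w) = w - 3.
    return w - 3.0

w, alpha, meta_lr = 0.0, 0.01, 0.001  # parameter, step size, meta step size
g_prev = 0.0                          # previous gradient, for the meta update

for _ in range(1000):
    g = grad(w)
    # Meta-gradient of the loss w.r.t. alpha is -g * g_prev, so this ascends
    # alpha while consecutive gradients agree and shrinks it when they start
    # to disagree (i.e., when the step size is too large).
    alpha += meta_lr * g * g_prev
    w -= alpha * g
    g_prev = g

print(f"w = {w:.4f} (target 3.0), adapted alpha = {alpha:.4f}")
```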
no code implementations • 30 Jan 2024 • Thomas Degris, Khurram Javed, Arsalan SharifNassab, Yuxin Liu, Richard Sutton
We conclude by suggesting that combining both approaches could be a promising future direction to improve the performance of neural networks in continual learning.
no code implementations • 31 Jan 2023 • Arsalan SharifNassab, Richard Sutton
Gradient-based methods for value estimation in reinforcement learning have favorable stability properties, but they are typically much slower than Temporal Difference (TD) learning methods.
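To make the contrast concrete, here is a minimal NumPy sketch of the two update rules for linear value estimation $V(s) = w^\top \phi(s)$ on a toy two-state chain; the chain, rewards, step sizes, and names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Semi-gradient TD(0) vs. residual-gradient updates on a toy two-state chain.
phi = {0: np.array([1.0, 0.0]), 1: np.array([0.0, 1.0])}
gamma, alpha = 0.9, 0.1

def td0_update(w, s, r, s_next):
    # Semi-gradient TD(0): typically fast, but not a true gradient method.
    delta = r + gamma * w @ phi[s_next] - w @ phi[s]
    return w + alpha * delta * phi[s]

def residual_gradient_update(w, s, r, s_next):
    # True gradient descent on the mean-squared TD error (naive residual
    # gradient): stable, but typically slower, and biased under stochastic
    # transitions due to the double-sampling issue.
    delta = r + gamma * w @ phi[s_next] - w @ phi[s]
    return w - alpha * delta * (gamma * phi[s_next] - phi[s])

rng = np.random.default_rng(0)
w_td, w_rg = np.zeros(2), np.zeros(2)
s = 0
for _ in range(5000):
    s_next = rng.integers(2)           # random transitions
    r = 1.0 if s_next == 1 else 0.0    # reward on entering state 1
    w_td = td0_update(w_td, s, r, s_next)
    w_rg = residual_gradient_update(w_rg, s, r, s_next)
    s = s_next

print("TD(0) weights:            ", w_td)
print("Residual-gradient weights:", w_rg)
```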
no code implementations • 14 Nov 2021 • Arsalan SharifNassab, John N. Tsitsiklis
We say that a random variable is light-tailed if moments of order $2+\epsilon$ are finite for some $\epsilon>0$; otherwise, we say that it is heavy-tailed.
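As a concrete instance of this definition (an illustrative example, not from the abstract), consider a Pareto variable with density $f(x)=\alpha x^{-\alpha-1}$ on $[1,\infty)$:

```latex
\[
  \mathbb{E}\!\left[X^{p}\right]
  = \int_{1}^{\infty} x^{p}\,\alpha x^{-\alpha-1}\,\mathrm{d}x
  = \begin{cases}
      \dfrac{\alpha}{\alpha-p}, & p < \alpha,\\[4pt]
      \infty, & p \ge \alpha.
    \end{cases}
\]
% Hence X is light-tailed iff \alpha > 2 (take \epsilon = \alpha - 2),
% and heavy-tailed whenever \alpha <= 2.
```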
no code implementations • 19 Aug 2021 • Arsalan SharifNassab, Saber Salehkaleybar, S. Jamaloddin Golestani
We then prove that this lower bound is order optimal in $m$ and $n$ by presenting a distributed learning algorithm, called Multi-Resolution Estimator for Non-Convex loss function (MRE-NC), whose expected loss matches the lower bound for large $mn$ up to polylogarithmic factors.
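For intuition about the setting, here is a toy sketch of communication-constrained one-shot distributed estimation: $m$ machines each hold $n$ samples and send a single short (quantized) message to a server, which aggregates them. All names and the quantizer are illustrative; this sketches the problem setup only, not the MRE-NC construction from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, bits = 100, 50, 8   # machines, samples per machine, bits per message
true_mean = 0.7

def quantize(x, bits, lo=-2.0, hi=2.0):
    # Uniform scalar quantizer: encode x with `bits` bits on [lo, hi].
    levels = 2 ** bits - 1
    code = round((np.clip(x, lo, hi) - lo) / (hi - lo) * levels)
    return lo + code / levels * (hi - lo)

messages = []
for _ in range(m):
    local = rng.normal(true_mean, 1.0, size=n)     # n local samples
    messages.append(quantize(local.mean(), bits))  # one short message each

estimate = np.mean(messages)  # server-side aggregation
print(f"estimate = {estimate:.4f} (true mean = {true_mean})")
```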
no code implementations • ICLR 2020 • Arsalan Sharifnassab, Saber Salehkaleybar, S. Jamaloddin Golestani
We show that there exist poor local minima with positive curvature for some training sets of size $n\geq m+2d-2$.
1 code implementation • NeurIPS 2019 • Arsalan Sharifnassab, Saber Salehkaleybar, S. Jamaloddin Golestani
We propose an algorithm called Multi-Resolution Estimator (MRE) whose expected error is no larger than $\tilde{O}\big(m^{-1/\max(d,2)}\, n^{-1/2}\big)$, where $d$ is the dimension of the parameter space.
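Reading off the stated rate (simple arithmetic on the bound above, not an additional claim from the paper):

```latex
\[
  \tilde{O}\!\big(m^{-1/\max(d,2)}\, n^{-1/2}\big)
  = \begin{cases}
      \tilde{O}\big((mn)^{-1/2}\big), & d \le 2,\\[2pt]
      \tilde{O}\big(m^{-1/d}\, n^{-1/2}\big), & d > 2.
    \end{cases}
\]
```

So for $d \le 2$ the distributed estimator achieves the usual parametric $(mn)^{-1/2}$ rate for $mn$ pooled samples up to polylogarithmic factors, while for $d > 2$ the dependence on the number of machines $m$ weakens with the dimension.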