no code implementations • 11 Oct 2023 • Behrad Moniri, Donghwan Lee, Hamed Hassani, Edgar Dobriban
However, with a constant gradient descent step size, this spike carries information only from the linear component of the target function, and therefore learning the non-linear components is impossible.
1 code implementation • 31 Jan 2023 • Donghwan Lee, Behrad Moniri, Xinmeng Huang, Edgar Dobriban, Hamed Hassani
Evaluating the performance of machine learning models under distribution shift is challenging, especially when we only have unlabeled data from the shifted (target) domain, along with labeled data from the original (source) domain.
no code implementations • 15 Feb 2022 • Hassan Hafez-Kolahi, Behrad Moniri, Shohreh Kasaei
We consider the frequentist problem of minimax excess risk as a zero-sum game between the algorithm designer and the world.
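As a sketch of this game-theoretic framing (the notation here is assumed for illustration, not taken from the paper), the minimax excess risk (MER) pits the algorithm designer, who picks a learning algorithm $A$, against the world, which picks the data distribution $P$:

```latex
\[
\mathrm{MER}
\;=\;
\inf_{A}\,\sup_{P \in \mathcal{P}}
\Big(
\mathbb{E}_{S \sim P^n}\big[R_P(A(S))\big]
\;-\;
\inf_{h \in \mathcal{H}} R_P(h)
\Big),
\]
```

where $S$ is an $n$-sample training set, $R_P$ is the population risk under $P$, and the excess risk compares the algorithm's output against the best hypothesis in the class $\mathcal{H}$; the $\inf$–$\sup$ order makes this a zero-sum game between designer and world.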
no code implementations • 10 May 2021 • Hassan Hafez-Kolahi, Behrad Moniri, Shohreh Kasaei, Mahdieh Soleymani Baghshah
For the upper bound, the optimization is further constrained to use $R$ bits from the training set, a setting which relates MER to information-theoretic bounds on the generalization gap in frequentist learning.