no code implementations • 26 Dec 2021 • Sultan J. Majeed
The field of artificial intelligence (AI) is devoted to the creation of artificial decision-makers that can perform (at least) on par with their human counterparts in a domain of interest.
no code implementations • 26 Dec 2021 • Sultan J. Majeed, Marcus Hutter
A distinguishing feature of ESA is that it proves an upper bound of $O\left(\varepsilon^{-A} \cdot (1-\gamma)^{-2A}\right)$ on the number of states required for the surrogate MDP (where $A$ is the number of actions, $\gamma$ is the discount factor, and $\varepsilon$ is the optimality gap), which holds uniformly for all domains.
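To get a feel for how this bound scales, the following minimal sketch evaluates the expression $\varepsilon^{-A} \cdot (1-\gamma)^{-2A}$ directly. The function name `esa_state_bound_scale` is hypothetical, and the constant hidden by the $O(\cdot)$ notation is assumed to be 1, so the result only indicates the order of growth, not an actual state count.

```python
def esa_state_bound_scale(num_actions: int, gamma: float, epsilon: float) -> float:
    """Evaluate eps^(-A) * (1 - gamma)^(-2A).

    Assumption: the constant hidden in the O(.) bound is taken as 1,
    so this is the growth rate of the bound, not the bound itself.
    """
    if num_actions < 1:
        raise ValueError("num_actions must be at least 1")
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    if epsilon <= 0.0:
        raise ValueError("epsilon must be positive")
    return epsilon ** (-num_actions) * (1.0 - gamma) ** (-2 * num_actions)


if __name__ == "__main__":
    # Illustrative values only: the expression grows exponentially in the
    # number of actions A and polynomially in 1/epsilon and 1/(1 - gamma).
    for A in (2, 3, 4):
        print(A, esa_state_bound_scale(A, gamma=0.9, epsilon=0.1))
```

The expression depends only on $A$, $\gamma$, and $\varepsilon$, not on any property of the underlying domain, which is what makes the bound uniform.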
no code implementations • 28 May 2019 • Marcus Hutter, Samuel Yang-Zhao, Sultan J. Majeed
The convergence of many reinforcement learning (RL) algorithms with linear function approximation has been investigated extensively, but most proofs assume that these methods converge to a unique solution.