Search Results for author: Zhiyou Yang

Found 1 paper, 0 papers with code

Double Thompson Sampling in Finite stochastic Games

no code implementations • 21 Feb 2022 • Shuqing Shi, Xiaobin Wang, Zhiyou Yang, Fan Zhang, Hong Qu

This algorithm achieves a total regret bound of $\tilde{\mathcal{O}}(D\sqrt{SAT})$ in time horizon $T$ with $S$ states, $A$ actions and diameter $D$.
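
To make the scaling of the stated bound concrete, the following is a minimal sketch that evaluates only the leading term $D\sqrt{SAT}$. It assumes the polylogarithmic factors and constants hidden by $\tilde{\mathcal{O}}$ can be ignored for illustration; the function name `regret_bound_scale` and the problem sizes are hypothetical and do not come from the paper.

```python
import math


def regret_bound_scale(D: float, S: int, A: int, T: int) -> float:
    """Leading term D * sqrt(S * A * T) of the bound.

    Assumption: constants and polylogarithmic factors hidden by the
    tilde-O notation are dropped, so this is a scale, not the bound itself.
    """
    return D * math.sqrt(S * A * T)


# Hypothetical problem sizes, chosen only for illustration.
D, S, A = 4.0, 10, 5

for T in (10_000, 40_000):
    total = regret_bound_scale(D, S, A, T)
    # The per-step average shrinks as T grows because the total is sublinear in T.
    print(f"T={T}: leading term={total:.1f}, per step={total / T:.4f}")
```

Under these assumptions, quadrupling $T$ doubles the leading term, so the average regret per step halves.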

Thompson Sampling
