4 Dec 2018 • Haoran Wang, Thaleia Zariphopoulou, Xunyu Zhou
We carry out a complete analysis of the problem in the linear-quadratic (LQ) setting and deduce that the optimal feedback control distribution for balancing exploitation and exploration is Gaussian.
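
To make the notion of a Gaussian feedback control distribution concrete, here is a minimal Python sketch. It is not the authors' derivation: the linear form of the mean (`gain * x`), the state-independent standard deviation, and all numeric values are assumptions chosen purely for illustration. It only shows what it means to sample a control from a Gaussian centred on a state-dependent value, and uses the standard closed form for the differential entropy of a Gaussian as one way to quantify the amount of exploration.

```python
import math
import random


def gaussian_policy_sample(x, gain, std, rng):
    """Draw one control u ~ N(gain * x, std**2).

    `gain` and `std` are hypothetical parameters for illustration,
    not values taken from the paper.
    """
    return rng.gauss(gain * x, std)


def gaussian_entropy(std):
    """Differential entropy of a 1-D Gaussian: 0.5 * ln(2 * pi * e * std**2).

    It depends only on the spread, not on the mean, so a wider
    policy corresponds to more exploration.
    """
    return 0.5 * math.log(2.0 * math.pi * math.e * std ** 2)


rng = random.Random(0)   # fixed seed for reproducibility
x = 1.5                  # current state (illustrative)
gain = -0.8              # hypothetical feedback gain
std = 0.5                # hypothetical exploration level

# Sample many controls at the same state and check they centre on gain * x.
samples = [gaussian_policy_sample(x, gain, std, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)

print("target mean:", gain * x)
print("sample mean:", sample_mean)
print("entropy at std=0.5:", gaussian_entropy(0.5))
print("entropy at std=1.0:", gaussian_entropy(1.0))
```

In this sketch the mean plays the role of exploitation (the control one would apply deterministically) and the standard deviation plays the role of exploration; shrinking `std` toward zero concentrates the samples on the mean.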