no code implementations • 27 Feb 2023 • Ziteng Cheng, Sebastian Jaimungal, Nick Martin
We introduce a distributional method for learning the optimal policy in risk-averse Markov decision processes with finite state and action spaces, latent costs, and stationary dynamics.