Search Results for author: Xiyue Peng

Found 1 paper, 0 papers with code

Adversarially Trained Actor Critic for offline CMDPs

No code implementations • 1 Jan 2024 • Honghao Wei, Xiyue Peng, Xin Liu, Arnob Ghosh

Theoretically, we demonstrate that when the actor employs a no-regret optimization oracle, SATAC achieves two guarantees: (i) For the first time in the offline RL setting, we establish that SATAC can produce a policy that outperforms the behavior policy while maintaining the same level of safety, which is critical to designing an algorithm for offline RL.

Tasks: Continuous Control, Offline RL (+1 more)
