1 code implementation • 6 Feb 2024 • Ruoqi Zhang, Ziwei Luo, Jens Sjölund, Thomas B. Schön, Per Mattsson
We show that such an SDE has a solution that we can use to calculate the log probability of the policy, yielding an entropy regularizer that improves the exploration of offline datasets.
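The entropy-regularized objective hinted at above can be illustrated with a minimal sketch. This is not the paper's SDE-based density: a diagonal Gaussian log-probability stands in for it, and the function names (`gaussian_log_prob`, `actor_loss`) and the temperature `alpha` are illustrative assumptions.

```python
import math

def gaussian_log_prob(action, mean, std):
    """Log-density of a 1-D Gaussian policy at `action`
    (stand-in for the SDE-derived log probability)."""
    return (-0.5 * ((action - mean) / std) ** 2
            - math.log(std) - 0.5 * math.log(2 * math.pi))

def actor_loss(q_value, action, mean, std, alpha=0.2):
    """Soft-actor-style objective: maximize the Q-value plus an
    entropy bonus, estimated as minus the log-probability of the
    sampled action, weighted by the temperature `alpha`."""
    log_prob = gaussian_log_prob(action, mean, std)
    return -(q_value - alpha * log_prob)
```

With `alpha > 0`, low-probability (high-entropy) actions reduce the loss, which is the mechanism by which such a regularizer can improve exploration beyond the behavior in the offline dataset.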
no code implementations • 30 Jun 2023 • Ruoqi Zhang, Jens Sjölund
Traditional reinforcement learning methods optimize agents without considering safety, potentially resulting in unintended consequences.
no code implementations • 20 Apr 2023 • Ruoqi Zhang, Per Mattsson, Torbjörn Wigren
While reinforcement learning has made great improvements, state-of-the-art algorithms can still struggle with seemingly simple set-point feedback control problems.
no code implementations • 20 Apr 2023 • Ruoqi Zhang, Per Mattsson, Torbjörn Wigren
This paper argues that three ideas can improve reinforcement learning methods even for highly nonlinear set-point control problems: 1) Make use of a prior feedback controller to aid amplitude exploration.
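Idea 1) can be sketched as a residual policy: the prior feedback controller supplies a baseline action, and the learned policy only adds a correction, so exploration starts near amplitudes the prior already reaches. The proportional controller and the function names here are illustrative assumptions, not the paper's implementation.

```python
def prior_controller(error, gain=1.0):
    """Simple proportional feedback controller used as the prior."""
    return gain * error

def combined_action(error, learned_correction):
    """Final control signal = prior feedback + learned residual term.
    With a zero correction, the agent behaves exactly like the prior."""
    return prior_controller(error) + learned_correction
```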
no code implementations • 20 Apr 2023 • Ruoqi Zhang, Per Mattsson, Torbjörn Wigren

As discussed in the paper, this leads to a separation in which the observer dynamics are captured by the recurrent neural network part, while the state feedback is handled by the feedback and feedforward networks.
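The separation above can be sketched with scalar arithmetic standing in for the neural networks: a recurrent update plays the observer, maintaining a state estimate from measurements, and a separate feedforward map turns the estimate and the reference into a control signal. The coefficients `a`, `b`, `k` and the function names are hypothetical placeholders.

```python
def observer_step(hidden, measurement, a=0.9, b=0.1):
    """Recurrent (observer) part: the hidden value is updated from
    the previous estimate and the new measurement."""
    return a * hidden + b * measurement

def feedback_action(hidden, reference, k=1.0):
    """Feedback/feedforward part: a static map from the state
    estimate and the reference to the control signal."""
    return k * (reference - hidden)
```

Keeping the two parts separate mirrors classical observer-based control: only the observer needs memory, while the feedback law can remain a memoryless network.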