no code implementations • 14 Apr 2024 • Simon Eisenmann, Daniel Hein, Steffen Udluft, Thomas A. Runkler
The policy is optimized with a gradient-free optimization scheme using the return estimate given by the model as the fitness function.
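The idea in that sentence — treat the model's return estimate as a fitness function and improve policy parameters without gradients — can be sketched with a simple (1+1) evolution strategy. Everything below is illustrative: the linear dynamics, the quadratic reward, and the linear policy are placeholder assumptions, not the paper's actual model or policy class.

```python
import numpy as np

# Placeholder stand-ins for a learned dynamics model and reward model.
def model_step(state, action):
    return 0.9 * state + 0.1 * action      # toy linear dynamics (assumed)

def reward(state, action):
    return -(state ** 2 + 0.01 * action ** 2)  # toy quadratic cost (assumed)

def estimated_return(params, state0=1.0, horizon=20):
    """Roll out a linear policy a = params[0]*s + params[1] in the model
    and accumulate the model-estimated return (the fitness value)."""
    s, ret = state0, 0.0
    for _ in range(horizon):
        a = params[0] * s + params[1]
        ret += reward(s, a)
        s = model_step(s, a)
    return ret

def optimize(iters=200, sigma=0.3, seed=0):
    """(1+1) evolution strategy: mutate the parameters with Gaussian noise
    and keep the candidate only if its model-estimated return improves."""
    rng = np.random.default_rng(seed)
    best = np.zeros(2)
    best_fit = estimated_return(best)
    for _ in range(iters):
        cand = best + sigma * rng.standard_normal(2)
        fit = estimated_return(cand)
        if fit > best_fit:
            best, best_fit = cand, fit
    return best, best_fit
```

Because candidates are only ever accepted on improvement, the final fitness can never be worse than the starting policy's — the defining property of this class of gradient-free schemes.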
no code implementations • 11 Aug 2023 • Marc Weber, Phillip Swazinna, Daniel Hein, Steffen Udluft, Volkmar Sterzing
Offline reinforcement learning provides a viable approach to obtain advanced control strategies for dynamical systems, in particular when direct interaction with the environment is not available.
no code implementations • 9 Jun 2022 • Simon Wiedemann, Daniel Hein, Steffen Udluft, Christian Mendl
We present a full implementation and simulation of a novel quantum reinforcement learning method.
1 code implementation • 14 Jan 2022 • Phillip Swazinna, Steffen Udluft, Daniel Hein, Thomas Runkler
Offline reinforcement learning (RL) algorithms are often designed with environments such as MuJoCo in mind, in which the planning horizon is extremely long and no noise exists.
no code implementations • 30 Aug 2021 • Daniel Hein, Daniel Labisch
In this paper, genetic programming reinforcement learning (GPRL) is utilized to generate human-interpretable control policies for a Chylla-Haase polymerization reactor.
1 code implementation • 12 Jul 2021 • Phillip Swazinna, Steffen Udluft, Daniel Hein, Thomas Runkler
In offline reinforcement learning, a policy needs to be learned from a single pre-collected dataset.
no code implementations • 20 Jul 2020 • Daniel Hein, Steffen Limmer, Thomas A. Runkler
In this paper, three recently introduced reinforcement learning (RL) methods are used to generate human-interpretable policies for the cart-pole balancing benchmark.
no code implementations • 29 Apr 2018 • Daniel Hein, Steffen Udluft, Thomas A. Runkler
Autonomously training interpretable control strategies, called policies, using pre-existing plant trajectory data is of great interest in industrial applications.
no code implementations • 12 Dec 2017 • Daniel Hein, Steffen Udluft, Thomas A. Runkler
Here we introduce the genetic programming for reinforcement learning (GPRL) approach based on model-based batch reinforcement learning and genetic programming, which autonomously learns policy equations from pre-existing default state-action trajectory samples.
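The core mechanics of a GPRL-style loop — evolve symbolic policy expressions and rank them by model-estimated return — can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the operator set, terminal set, surrogate dynamics, and selection scheme here are all simplifying assumptions.

```python
import random

# Tiny expression language for symbolic policies a = expr(s).
OPS = {
    "add": (2, lambda a, b: a + b),
    "mul": (2, lambda a, b: a * b),
    "neg": (1, lambda a: -a),
}
TERMINALS = ["s", 1.0, -1.0, 0.5]   # state variable and a few constants

def random_tree(depth=2, rng=random):
    """Grow a random expression tree of bounded depth."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(list(OPS))
    arity = OPS[op][0]
    return (op,) + tuple(random_tree(depth - 1, rng) for _ in range(arity))

def evaluate(tree, s):
    """Evaluate an expression tree at state s."""
    if tree == "s":
        return s
    if isinstance(tree, float):
        return tree
    op, *args = tree
    return OPS[op][1](*(evaluate(a, s) for a in args))

def fitness(tree, horizon=20):
    """Model-estimated return of the policy a = expr(s), clipped to [-2, 2],
    under a placeholder learned dynamics model (assumed, for illustration)."""
    s, ret = 1.0, 0.0
    for _ in range(horizon):
        a = max(-2.0, min(2.0, evaluate(tree, s)))
        ret += -(s * s)               # quadratic cost around the origin
        s = 0.9 * s + 0.1 * a         # toy surrogate dynamics
    return ret

def evolve(generations=30, pop_size=40, seed=1):
    """Truncation selection: keep the fitter half, refill with fresh trees."""
    rng = random.Random(seed)
    pop = [random_tree(rng=rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        pop = parents + [random_tree(rng=rng) for _ in range(pop_size - len(parents))]
    return max(pop, key=fitness)
```

A full GP system would add crossover, subtree mutation, and a complexity penalty to keep the evolved equations human-interpretable; the sketch keeps only the evaluate-rank-select skeleton.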
2 code implementations • 27 Sep 2017 • Daniel Hein, Stefan Depeweg, Michel Tokic, Steffen Udluft, Alexander Hentschel, Thomas A. Runkler, Volkmar Sterzing
On the one hand, these benchmarks are designed to provide interpretable RL training scenarios and detailed insight into the learning process of the method at hand.
no code implementations • 20 May 2017 • Daniel Hein, Steffen Udluft, Michel Tokic, Alexander Hentschel, Thomas A. Runkler, Volkmar Sterzing
The Particle Swarm Optimization Policy (PSO-P) was recently introduced and has been shown to produce remarkable results when interacting with academic reinforcement learning benchmarks in an off-policy, batch-based setting.
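The PSO-P idea can be sketched as model-predictive control with particle swarm optimization: at each state, a swarm searches over action sequences in a learned model, and only the first action of the best sequence is executed. The dynamics and reward below are placeholder assumptions, and the swarm hyperparameters are generic textbook values, not the paper's.

```python
import numpy as np

def model_step(s, a):
    return 0.9 * s + 0.1 * a          # placeholder learned dynamics (assumed)

def rollout_return(s0, actions):
    """Model-estimated return of an open-loop action sequence from s0."""
    s, ret = s0, 0.0
    for a in actions:
        ret += -(s * s + 0.01 * a * a)
        s = model_step(s, a)
    return ret

def pso_action(s0, horizon=10, particles=30, iters=40, seed=0):
    """PSO over action sequences in [-1, 1]^horizon; return the first action."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1, 1, (particles, horizon))   # particle positions
    v = np.zeros_like(x)                           # particle velocities
    pbest = x.copy()                               # per-particle best positions
    pbest_fit = np.array([rollout_return(s0, xi) for xi in x])
    gbest = pbest[pbest_fit.argmax()].copy()       # swarm-wide best
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        # standard PSO update: inertia + cognitive + social terms
        v = 0.7 * v + 1.4 * r1 * (pbest - x) + 1.4 * r2 * (gbest - x)
        x = np.clip(x + v, -1, 1)
        fit = np.array([rollout_return(s0, xi) for xi in x])
        improved = fit > pbest_fit
        pbest[improved], pbest_fit[improved] = x[improved], fit[improved]
        gbest = pbest[pbest_fit.argmax()].copy()
    return gbest[0]                    # receding horizon: execute first action only
```

Replanning from every new state in this way makes the scheme naturally off-policy and batch-compatible: the swarm only ever queries the learned model, never the real system.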
no code implementations • 19 Oct 2016 • Daniel Hein, Alexander Hentschel, Thomas Runkler, Steffen Udluft
To the best of our knowledge, this approach is the first to relate self-organizing fuzzy controllers to model-based batch RL.
no code implementations • 12 Oct 2016 • Daniel Hein, Alexander Hentschel, Volkmar Sterzing, Michel Tokic, Steffen Udluft
A novel reinforcement learning benchmark, called the Industrial Benchmark, is introduced.