Safe Exploration is an approach to collecting ground-truth data by interacting safely with the environment.
We evaluate the resulting algorithm on safely exploring the dynamics of an inverted pendulum and on solving a reinforcement learning task on a cart-pole system with safety constraints.
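As a loose illustration of this kind of setup, the sketch below explores a toy inverted pendulum while certifying each exploratory action against a nominal model and a known-safe fallback controller, in the spirit of MPC-style safe exploration. All dynamics, gains, limits, and the certification horizon are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

G, L, DT = 9.81, 1.0, 0.02   # gravity, pole length, Euler time step (assumed)
THETA_MAX = 0.5              # safety constraint: |theta| < 0.5 rad (assumed)

def step(state, torque):
    """One semi-implicit Euler step of the nominal pendulum model."""
    theta, omega = state
    omega = omega + DT * (G / L * np.sin(theta) + torque)
    return np.array([theta + DT * omega, omega])

def is_safe(state):
    return abs(state[0]) < THETA_MAX

def fallback(state):
    """Stabilizing controller assumed to be known safe in advance."""
    return -20.0 * state[0] - 5.0 * state[1]

def certifiably_safe(state, torque, horizon=50):
    """Accept `torque` only if, after applying it once, the fallback
    controller keeps the pendulum inside the constraint set for
    `horizon` steps (an MPC-style feasibility check)."""
    s = step(state, torque)
    for _ in range(horizon):
        if not is_safe(s):
            return False
        s = step(s, fallback(s))
    return True

rng = np.random.default_rng(0)
state, worst = np.array([0.05, 0.0]), 0.0
for _ in range(300):
    proposed = rng.uniform(-2.0, 2.0)   # random exploratory action
    action = proposed if certifiably_safe(state, proposed) else fallback(state)
    state = step(state, action)
    worst = max(worst, abs(state[0]))
print(f"largest |theta| seen: {worst:.3f} rad (limit {THETA_MAX})")
```

The key design choice is that exploration only proceeds when a certified recovery plan exists; otherwise the system follows the fallback, so safety does not depend on the exploratory actions themselves.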
However, such learning-based methods typically do not provide any safety guarantees, which prevents their use in safety-critical, real-world applications.
We also prove that our method of enforcing the safety constraints preserves all safe policies from the original environment.
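A minimal sketch of the property being claimed, assuming a generic shield construction (all names here are hypothetical): the shield only ever overrides actions that violate the constraint, so any policy that is already safe is left unchanged.

```python
from typing import Callable, Sequence

State = Sequence[float]
Action = int

def shield(policy: Callable[[State], Action],
           is_allowed: Callable[[State, Action], bool],
           safe_default: Callable[[State], Action]) -> Callable[[State], Action]:
    """Wrap `policy` so that every executed action satisfies the constraint."""
    def shielded(state: State) -> Action:
        a = policy(state)
        if is_allowed(state, a):
            return a                 # pass-through: safe actions are untouched
        return safe_default(state)   # override only on unsafe proposals
    return shielded
```

Because the override branch is reached only on constraint violations, composing the shield with a policy that never violates the constraint yields exactly the same behavior, which is the preservation property in question.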
We define safety in terms of an a priori unknown safety constraint that depends on states and actions.
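When the constraint is unknown a priori, it must be estimated from observed data and queried pessimistically. A Gaussian process is a common choice for this; in the hypothetical sketch below, a nearest-neighbour estimate with a fixed uncertainty margin stands in for it (a simplification, not a method from the text).

```python
import numpy as np

class ConstraintModel:
    """Pessimistic estimate of an unknown safety cost c(s, a)."""

    def __init__(self, margin: float = 0.2, k: int = 5):
        self.X, self.y = [], []
        self.margin, self.k = margin, k   # uncertainty margin, neighbours

    def observe(self, state, action, cost):
        """Record one (state, action, cost) triple from safe interaction."""
        self.X.append(np.concatenate([state, np.atleast_1d(action)]))
        self.y.append(cost)

    def upper_bound(self, state, action):
        """Mean of the k nearest observations plus a margin;
        +inf when too little data supports a prediction."""
        if len(self.X) < self.k:
            return np.inf
        x = np.concatenate([state, np.atleast_1d(action)])
        d = np.linalg.norm(np.array(self.X) - x, axis=1)
        nearest = np.argsort(d)[: self.k]
        return float(np.mean(np.array(self.y)[nearest]) + self.margin)

    def deemed_safe(self, state, action, threshold=1.0):
        # Act only when even the pessimistic bound respects the threshold.
        return self.upper_bound(state, action) < threshold
```

The pessimism is what makes this usable for safe exploration: an action is only taken once the data rules out, up to the margin, that it violates the constraint.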
Ensuring the safety and explainability of machine learning (ML) is increasingly relevant as data-driven applications enter safety-critical domains, which are traditionally held to high safety standards that cannot be met by testing alone when the underlying black-box systems are otherwise inaccessible.
A key challenge in applying reinforcement learning to safety-critical domains is understanding how to balance exploration (needed to attain good performance on the task) with safety (needed to avoid catastrophic failure).
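One standard way to formalize this balance is as a constrained problem, max_pi E[return] subject to E[cost] <= d, solved with a Lagrangian relaxation in which the penalty weight lambda adapts to how much exploration violates the budget. The toy bandit below is an assumed illustration of that mechanism, not a benchmark from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 0.1                     # per-step safety-cost budget (assumed)
lam, lam_lr = 0.0, 0.05     # Lagrange multiplier and its step size
prefs = np.zeros(2)         # softmax preferences over two actions

for _ in range(2000):
    probs = np.exp(prefs) / np.exp(prefs).sum()
    a = rng.choice(2, p=probs)
    # Action 1 earns more reward but incurs more safety cost.
    reward = [0.5, 1.0][a] + 0.1 * rng.standard_normal()
    cost = [0.0, 0.4][a]
    # Policy ascends the Lagrangian objective: reward - lambda * cost.
    grad = reward - lam * cost
    prefs[a] += 0.01 * grad * (1 - probs[a])
    prefs[1 - a] -= 0.01 * grad * probs[1 - a]
    # Dual ascent on lambda: it grows while the cost budget is exceeded.
    lam = max(0.0, lam + lam_lr * (cost - d))

print(f"final action probabilities {probs}, lambda {lam:.2f}")
```

At equilibrium the multiplier settles where the expected cost meets the budget, so the agent explores the risky action only as often as the constraint allows.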
We present Revel, a partially neural reinforcement learning (RL) framework for provably safe exploration in continuous state and action spaces.
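A rough sketch of what "partially neural" can mean in this spirit: an unverified neural component acts freely inside a region where a verified symbolic controller is known to be able to recover, and the symbolic controller takes over at the boundary. The region, the linear law, and the network below are toy stand-ins, not Revel's actual verified artifacts.

```python
import numpy as np

def neural_policy(state, W):
    """Unverified learned component (toy single-layer network)."""
    return np.tanh(W @ state)

def symbolic_shield(state):
    """Simple symbolic control law standing in for a verified one."""
    return -1.5 * state[0] - 0.5 * state[1]

def in_recoverable_region(state):
    # In a verified framework this set would come from formal analysis
    # of the shield; here it is a hand-picked box (assumption).
    return abs(state[0]) < 0.3 and abs(state[1]) < 1.0

def act(state, W):
    if in_recoverable_region(state):
        return neural_policy(state, W)   # free to explore inside the region
    return symbolic_shield(state)        # provable fallback at the boundary
```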