no code implementations • 29 Apr 2024 • Xingyuan Zhang, Philip Becker-Ehmck, Patrick van der Smagt, Maximilian Karl
In this paper, we study Imitation Learning from Observation with pretrained models and find existing approaches such as BCO and AIME face knowledge barriers, specifically the Embodiment Knowledge Barrier (EKB) and the Demonstration Knowledge Barrier (DKB), greatly limiting their performance.
1 code implementation • NeurIPS 2023 • Xingyuan Zhang, Philip Becker-Ehmck, Patrick van der Smagt, Maximilian Karl
Our method is "zero-shot" in the sense that it does not require further training for the world model or online interactions with the environment after being given the demonstration.
no code implementations • ICML Workshop URL 2021 • Philip Becker-Ehmck, Maximilian Karl, Jan Peters, Patrick van der Smagt
We show that while such an agent is still novelty seeking, i.e., interested in exploring the whole state space, it focuses on exploration where its perceived influence is greater, avoiding areas of greater stochasticity or traps that limit its control.
1 code implementation • 19 Mar 2020 • Philip Becker-Ehmck, Maximilian Karl, Jan Peters, Patrick van der Smagt
Learning to control robots without requiring engineered models has been a long-term goal, promising diverse and novel applications.
no code implementations • 2 Nov 2019 • Neha Das, Maximilian Karl, Philip Becker-Ehmck, Patrick van der Smagt
Learning a model of dynamics from high-dimensional images can be a core ingredient for success in many applications across different domains, especially in sequential decision making.
no code implementations • 29 May 2019 • Philip Becker-Ehmck, Jan Peters, Patrick van der Smagt
System identification of complex and nonlinear systems is a central problem for model predictive control and model-based reinforcement learning.
no code implementations • 13 Oct 2017 • Maximilian Karl, Maximilian Soelch, Philip Becker-Ehmck, Djalel Benbouzid, Patrick van der Smagt, Justin Bayer
We introduce a methodology for efficiently computing a lower bound to empowerment, allowing it to be used as an unsupervised cost function for policy learning in real-time control.