1 code implementation • 15 Apr 2024 • Tidiane Camaret Ndir, André Biedenkapp, Noor Awad
In this work, we address the challenge of zero-shot generalization (ZSG) in Reinforcement Learning (RL), where agents must adapt to entirely novel environments without additional training.
1 code implementation • 16 Mar 2024 • Sai Prasanna, Karim Farid, Raghu Rajan, André Biedenkapp
Toward the goal of ZSG under unseen variation in context, we propose the contextual recurrent state-space model (cRSSM), which introduces changes to the world model of Dreamer (v3) (Hafner et al., 2023).
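A minimal sketch of the general idea, conditioning a recurrent state-space model on a context vector so that both the latent dynamics and the decoder can vary with it. The module names, dimensions, and the simple concatenation used here are illustrative assumptions, not the paper's exact cRSSM architecture.

```python
import torch
import torch.nn as nn

class ContextualRSSM(nn.Module):
    """Toy recurrent state-space model whose transition and decoder are
    conditioned on a fixed context vector (illustrative sketch only)."""

    def __init__(self, obs_dim, act_dim, ctx_dim, hidden_dim=64, latent_dim=16):
        super().__init__()
        self.rnn = nn.GRUCell(latent_dim + act_dim + ctx_dim, hidden_dim)
        self.posterior = nn.Linear(hidden_dim + obs_dim, 2 * latent_dim)
        self.decoder = nn.Linear(hidden_dim + latent_dim + ctx_dim, obs_dim)

    def step(self, h, z, action, obs, ctx):
        # Deterministic path: the context enters every latent transition.
        h = self.rnn(torch.cat([z, action, ctx], dim=-1), h)
        # Stochastic latent via the reparameterisation trick.
        mean, logstd = self.posterior(torch.cat([h, obs], dim=-1)).chunk(2, dim=-1)
        z = mean + logstd.exp() * torch.randn_like(mean)
        # Reconstruction also sees the context, so observations can depend on it.
        recon = self.decoder(torch.cat([h, z, ctx], dim=-1))
        return h, z, recon
```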
no code implementations • 9 Feb 2024 • Gresa Shala, André Biedenkapp, Josif Grabocka
We introduce Hierarchical Transformers for Meta-Reinforcement Learning (HTrMRL), a powerful online meta-reinforcement learning approach.
2 code implementations • 7 Jun 2022 • René Sass, Eddie Bergman, André Biedenkapp, Frank Hutter, Marius Lindauer
Automated Machine Learning (AutoML) is used more than ever before to support users in determining efficient hyperparameters, neural architectures, or even full machine learning pipelines.
1 code implementation • 27 May 2022 • Steven Adriaensen, André Biedenkapp, Gresa Shala, Noor Awad, Theresa Eimer, Marius Lindauer, Frank Hutter
The performance of an algorithm often critically depends on its parameter configuration.
1 code implementation • 9 Feb 2022 • Carolin Benjamins, Theresa Eimer, Frederik Schubert, Aditya Mohan, Sebastian Döhler, André Biedenkapp, Bodo Rosenhahn, Frank Hutter, Marius Lindauer
While Reinforcement Learning (RL) has made great strides towards solving increasingly complicated problems, many algorithms are still brittle to even slight environmental changes.
1 code implementation • 7 Feb 2022 • André Biedenkapp, Nguyen Dang, Martin S. Krejca, Frank Hutter, Carola Doerr
We extend this benchmark by analyzing optimal control policies that can select the parameters only from a given portfolio of possible values.
no code implementations • 11 Jan 2022 • Jack Parker-Holder, Raghu Rajan, Xingyou Song, André Biedenkapp, Yingjie Miao, Theresa Eimer, Baohe Zhang, Vu Nguyen, Roberto Calandra, Aleksandra Faust, Frank Hutter, Marius Lindauer
The combination of Reinforcement Learning (RL) with deep learning has led to a series of impressive feats, with many believing (deep) RL provides a path towards generally capable agents.
1 code implementation • 5 Oct 2021 • Carolin Benjamins, Theresa Eimer, Frederik Schubert, André Biedenkapp, Bodo Rosenhahn, Frank Hutter, Marius Lindauer
While Reinforcement Learning has made great strides towards solving ever more complicated tasks, many algorithms are still brittle to even slight changes in their environment.
1 code implementation • 20 Sep 2021 • Marius Lindauer, Katharina Eggensperger, Matthias Feurer, André Biedenkapp, Difan Deng, Carolin Benjamins, Tim Ruhopf, René Sass, Frank Hutter
Algorithm parameters, in particular hyperparameters of machine learning algorithms, can substantially impact their performance.
1 code implementation • 9 Jun 2021 • André Biedenkapp, Raghu Rajan, Frank Hutter, Marius Lindauer
Reinforcement learning is a powerful approach to learning behaviour through interactions with an environment.
1 code implementation • 9 Jun 2021 • Theresa Eimer, André Biedenkapp, Frank Hutter, Marius Lindauer
Reinforcement learning (RL) has made significant advances in solving individual problems within a given environment, but learning policies that generalize to unseen variations of a problem remains challenging.
1 code implementation • 18 May 2021 • Theresa Eimer, André Biedenkapp, Maximilian Reimer, Steven Adriaensen, Frank Hutter, Marius Lindauer
Dynamic Algorithm Configuration (DAC) aims to dynamically control a target algorithm's hyperparameters in order to improve its performance.
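As a rough, hypothetical illustration of the DAC setting (not the paper's benchmark or interface): a controller observes the state of a running target algorithm and resets one of its hyperparameters at every step, choosing from a small portfolio of values.

```python
import random

def run_with_dynamic_configuration(target, controller, budget):
    """Toy DAC loop: at every iteration the controller picks a new
    hyperparameter value based on the target algorithm's current state."""
    state = target.reset()
    for step in range(budget):
        config = controller.select(state, step)  # e.g. a mutation rate or learning rate
        state, cost = target.step(config)        # advance the target algorithm one step
        controller.observe(state, cost)          # feedback used to improve the policy
    return target.best_solution()

class RandomController:
    """Baseline controller sampling uniformly from a portfolio of values."""
    def __init__(self, portfolio):
        self.portfolio = portfolio

    def select(self, state, step):
        return random.choice(self.portfolio)

    def observe(self, state, cost):
        pass
```

Here `target`, `controller`, and their methods are placeholders; a learned DAC policy would replace the random selection with a state-dependent one.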
1 code implementation • 26 Feb 2021 • Baohe Zhang, Raghu Rajan, Luis Pineda, Nathan Lambert, André Biedenkapp, Kurtland Chua, Frank Hutter, Roberto Calandra
We demonstrate that this problem can be tackled effectively with automated HPO, which yields significantly improved performance compared to tuning by human experts.
no code implementations • 5 Feb 2021 • Samuel Müller, André Biedenkapp, Frank Hutter
To do this, we optimize the loss of the next training step.
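One way to read this idea is as a one-step lookahead: make the quantity being tuned differentiable, take a single training step, and backpropagate the post-update loss into it. The sketch below does this for a learnable learning rate on a toy regression problem; it is a generic hypergradient-style illustration under those assumptions, not the authors' method.

```python
import torch

# Toy model: linear regression weights and a learnable log learning rate.
w = torch.randn(5, requires_grad=True)
log_lr = torch.zeros(1, requires_grad=True)
meta_opt = torch.optim.Adam([log_lr], lr=1e-2)

def loss_fn(weights, x, y):
    return ((x @ weights - y) ** 2).mean()

for _ in range(100):
    x, y = torch.randn(32, 5), torch.randn(32)
    # Inner step: one gradient update, kept differentiable via create_graph.
    grad_w, = torch.autograd.grad(loss_fn(w, x, y), w, create_graph=True)
    w_next = w - log_lr.exp() * grad_w
    # Outer step: the objective is the loss of the *next* training step;
    # backpropagate it into the learning rate.
    meta_opt.zero_grad()
    loss_fn(w_next, x, y).backward()
    meta_opt.step()
    # Commit the inner update without carrying the old graph along.
    w = w_next.detach().requires_grad_(True)
```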
no code implementations • 28 Sep 2020 • Raghu Rajan, Jessica Lizeth Borja Diaz, Suresh Guttikonda, Fabio Ferreira, André Biedenkapp, Frank Hutter
We present MDP Playground, an efficient benchmark for Reinforcement Learning (RL) algorithms with various dimensions of hardness that can be controlled independently to challenge algorithms in different ways and to obtain varying degrees of hardness in generated environments.
1 code implementation • ICLR 2021 • Jörg K. H. Franke, Gregor Köhler, André Biedenkapp, Frank Hutter
Despite significant progress in challenging problems across various domains, applying state-of-the-art deep reinforcement learning (RL) algorithms remains challenging due to their sensitivity to the choice of hyperparameters.
1 code implementation • 15 Jun 2020 • David Speck, André Biedenkapp, Frank Hutter, Robert Mattmüller, Marius Lindauer
We show that dynamic algorithm configuration can be used for dynamic heuristic selection which takes into account the internal search dynamics of a planning system.
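A toy illustration of dynamic heuristic selection (with hypothetical helper names, not the planner integration from the paper): a greedy best-first search in which a policy decides, at every expansion, which heuristic to evaluate successor states with, based on simple search statistics.

```python
import heapq

def search_with_dynamic_heuristic(start, successors, is_goal, heuristics, select_heuristic):
    """Greedy best-first search where the heuristic used for evaluation is
    re-selected at every expansion from the current search statistics."""
    stats = {"expansions": 0, "open_size": 1}
    open_list = [(0, 0, start, [start])]
    seen, counter = {start}, 1
    while open_list:
        _, _, node, path = heapq.heappop(open_list)
        if is_goal(node):
            return path
        stats["expansions"] += 1
        h = select_heuristic(stats, heuristics)  # DAC-style per-step decision
        for child in successors(node):
            if child not in seen:
                seen.add(child)
                heapq.heappush(open_list, (h(child), counter, child, path + [child]))
                counter += 1
        stats["open_size"] = len(open_list)
    return None
```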
1 code implementation • 1 Jun 2020 • André Biedenkapp, H. Furkan Bozkurt, Theresa Eimer, Frank Hutter, Marius Lindauer
The performance of many algorithms in the fields of hard combinatorial problem solving, machine learning or AI in general depends on parameter tuning.
1 code implementation • 17 Sep 2019 • Raghu Rajan, Jessica Lizeth Borja Diaz, Suresh Guttikonda, Fabio Ferreira, André Biedenkapp, Jan Ole von Hartz, Frank Hutter
We define a parameterised collection of fast-to-run toy environments in OpenAI Gym by varying these dimensions and propose to use these to understand agents better.
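A hypothetical sketch of the kind of parameterised toy environment this enables; the class and parameter names below are made up for illustration and do not reflect the actual MDP Playground API.

```python
import gym
import numpy as np
from gym import spaces

class ConfigurableChainEnv(gym.Env):
    """Tiny chain MDP whose difficulty is controlled by explicit parameters."""

    def __init__(self, length=8, transition_noise=0.0, reward_noise=0.0):
        super().__init__()
        self.length = length
        self.transition_noise = transition_noise  # prob. of ignoring the chosen action
        self.reward_noise = reward_noise          # std. of Gaussian noise on rewards
        self.observation_space = spaces.Discrete(length)
        self.action_space = spaces.Discrete(2)    # 0: move left, 1: move right

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        if np.random.rand() < self.transition_noise:
            action = self.action_space.sample()
        self.pos = int(np.clip(self.pos + (1 if action == 1 else -1), 0, self.length - 1))
        done = self.pos == self.length - 1
        reward = (1.0 if done else 0.0) + np.random.randn() * self.reward_noise
        return self.pos, reward, done, {}

# Sweep a single dimension of hardness while keeping everything else fixed.
envs = [ConfigurableChainEnv(transition_noise=p) for p in (0.0, 0.1, 0.25)]
```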
no code implementations • 19 Aug 2019 • Marius Lindauer, Matthias Feurer, Katharina Eggensperger, André Biedenkapp, Frank Hutter
Bayesian Optimization (BO) is a common approach for hyperparameter optimization (HPO) in automated machine learning.
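For context, a minimal textbook BO loop for a single hyperparameter, using a Gaussian-process surrogate and expected improvement; the toy objective and search range are invented for illustration, and this is not the tool described in the paper.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(lr):
    """Stand-in for an expensive validation loss as a function of one hyperparameter."""
    return np.sin(3 * lr) + 0.1 * np.random.randn()

X = np.array([[0.1], [1.5], [2.9]])                    # initial design
y = np.array([objective(x[0]) for x in X])
candidates = np.linspace(0.0, 3.0, 200).reshape(-1, 1)

for _ in range(20):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
    mu, sigma = gp.predict(candidates, return_std=True)
    best = y.min()
    # Expected improvement over the current best observation (minimization).
    z = (best - mu) / np.maximum(sigma, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
    x_next = candidates[np.argmax(ei)]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next[0]))

print("best hyperparameter found:", X[np.argmin(y)][0])
```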
1 code implementation • 16 Aug 2019 • Marius Lindauer, Katharina Eggensperger, Matthias Feurer, André Biedenkapp, Joshua Marben, Philipp Müller, Frank Hutter
Hyperparameter optimization and neural architecture search can become prohibitively expensive for regular black-box Bayesian optimization because the training and evaluation of a single model can easily take several hours.
no code implementations • 18 Jun 2019 • André Biedenkapp, H. Furkan Bozkurt, Frank Hutter, Marius Lindauer
The performance of many algorithms in the fields of hard combinatorial problem solving, machine learning or AI in general depends on tuned hyperparameter configurations.