Search Results for author: Supratik Paul

Found 7 papers, 2 papers with code

Embedding Synthetic Off-Policy Experience for Autonomous Driving via Zero-Shot Curricula

no code implementations • 2 Dec 2022 • Eli Bronstein, Sirish Srinivasan, Supratik Paul, Aman Sinha, Matthew O'Kelly, Payam Nikdel, Shimon Whiteson

However, this approach produces agents that do not perform robustly in safety-critical settings, an issue that cannot be addressed by simply adding more data to the training set: we show that an agent trained using only a 10% subset of the data performs just as well as an agent trained on the entire dataset.

Autonomous Driving, Imitation Learning, +1


Hierarchical Model-Based Imitation Learning for Planning in Autonomous Driving

no code implementations • 18 Oct 2022 • Eli Bronstein, Mark Palatucci, Dominik Notz, Brandyn White, Alex Kuefler, Yiren Lu, Supratik Paul, Payam Nikdel, Paul Mougin, Hongge Chen, Justin Fu, Austin Abrams, Punit Shah, Evan Racah, Benjamin Frenkel, Shimon Whiteson, Dragomir Anguelov

We demonstrate the first large-scale application of model-based generative adversarial imitation learning (MGAIL) to the task of dense urban self-driving.

Autonomous Driving, Imitation Learning, +1


Fast Efficient Hyperparameter Tuning for Policy Gradient Methods

1 code implementation • NeurIPS 2019 • Supratik Paul, Vitaly Kurin, Shimon Whiteson

The main idea is to use existing trajectories sampled by the policy gradient method to optimise a one-step improvement objective, yielding a sample and computationally efficient algorithm that is easy to implement.

Policy Gradient Methods


Fast Efficient Hyperparameter Tuning for Policy Gradients

1 code implementation • 18 Feb 2019 • Supratik Paul, Vitaly Kurin, Shimon Whiteson

Meta-Learning, Policy Gradient Methods


Learning from Demonstration in the Wild

no code implementations • 8 Nov 2018 • Feryal Behbahani, Kyriacos Shiarlis, Xi Chen, Vitaly Kurin, Sudhanshu Kasewa, Ciprian Stirbu, João Gomes, Supratik Paul, Frans A. Oliehoek, João Messias, Shimon Whiteson

Learning from demonstration (LfD) is useful in settings where hand-coding behaviour or a reward function is impractical.


Fingerprint Policy Optimisation for Robust Reinforcement Learning

no code implementations • 27 May 2018 • Supratik Paul, Michael A. Osborne, Shimon Whiteson

Policy gradient methods ignore the potential value of adjusting environment variables: unobservable state features that are randomly determined by the environment in a physical setting, but are controllable in a simulator.

Bayesian Optimisation, Continuous Control, +3


Alternating Optimisation and Quadrature for Robust Control

no code implementations • 24 May 2016 • Supratik Paul, Konstantinos Chatzilygeroudis, Kamil Ciosek, Jean-Baptiste Mouret, Michael A. Osborne, Shimon Whiteson

ALOQ is robust to the presence of significant rare events, which may not be observable under random sampling, but play a substantial role in determining the optimal policy.

Bayesian Optimisation

