Search Results for author: Abhishek Naik

Found 6 papers, 2 papers with code

Average-Reward Learning and Planning with Options

no code implementations NeurIPS 2021 Yi Wan, Abhishek Naik, Richard S. Sutton

We extend the options framework for temporal abstraction in reinforcement learning from discounted Markov decision processes (MDPs) to average-reward MDPs.

reinforcement-learning Reinforcement Learning (RL)

Planning with Expectation Models for Control

no code implementations17 Apr 2021 Katya Kudashkina, Yi Wan, Abhishek Naik, Richard S. Sutton

Our algorithms and experiments are the first to treat MBRL with expectation models in a general setting.

Model-based Reinforcement Learning

MADRaS : Multi Agent Driving Simulator

no code implementations2 Oct 2020 Anirban Santara, Sohan Rudra, Sree Aditya Buridi, Meha Kaushik, Abhishek Naik, Bharat Kaul, Balaraman Ravindran

In this work, we present MADRaS, an open-source multi-agent driving simulator for use in the design and evaluation of motion planning algorithms for autonomous driving.

Autonomous Driving Car Racing +5

Learning and Planning in Average-Reward Markov Decision Processes

1 code implementation29 Jun 2020 Yi Wan, Abhishek Naik, Richard S. Sutton

We introduce learning and planning algorithms for average-reward MDPs, including 1) the first general proven-convergent off-policy model-free control algorithm without reference states, 2) the first proven-convergent off-policy model-free prediction algorithm, and 3) the first off-policy learning algorithm that converges to the actual value function rather than to the value function plus an offset.

Discounted Reinforcement Learning Is Not an Optimization Problem

no code implementations4 Oct 2019 Abhishek Naik, Roshan Shariff, Niko Yasui, Hengshuai Yao, Richard S. Sutton

Discounted reinforcement learning is fundamentally incompatible with function approximation for control in continuing tasks.

Misconceptions reinforcement-learning +1

RAIL: Risk-Averse Imitation Learning

1 code implementation20 Jul 2017 Anirban Santara, Abhishek Naik, Balaraman Ravindran, Dipankar Das, Dheevatsa Mudigere, Sasikanth Avancha, Bharat Kaul

Generative Adversarial Imitation Learning (GAIL) is a state-of-the-art algorithm for learning policies when the expert's behavior is available as a fixed set of trajectories.

Autonomous Driving Continuous Control +1

Cannot find the paper you are looking for? You can Submit a new open access paper.