no code implementations • 22 Feb 2022 • Nithia Vijayan, Prashanth L. A
We propose policy gradient algorithms for solving a risk-sensitive reinforcement learning (RL) problem in on-policy as well as off-policy settings.
no code implementations • 9 Jul 2021 • Nithia Vijayan, Prashanth L. A
We propose policy gradient algorithms that learn risk-sensitive policies in a reinforcement learning (RL) framework.
no code implementations • 6 Jan 2021 • Nithia Vijayan, Prashanth L. A
From these results, we infer that the first algorithm converges at a rate that is comparable to the well-known REINFORCE algorithm in an off-policy RL context, while the second algorithm exhibits an improved rate of convergence.
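To make the off-policy REINFORCE comparison concrete, here is a minimal sketch of an importance-sampling-weighted REINFORCE gradient estimate; the softmax policy, the one-step setting, and all function names are illustrative assumptions, not the paper's construction:

```python
import numpy as np

# Illustrative sketch: REINFORCE in an off-policy setting. Samples are
# drawn from a behavior policy mu; the gradient for the target policy
# pi_theta is reweighted by rho = pi_theta(a) / mu(a).

def softmax_probs(theta):
    """Softmax policy over a small discrete action set."""
    z = np.exp(theta - theta.max())
    return z / z.sum()

def grad_log_softmax(theta, a):
    """Gradient of log pi_theta(a) for a softmax policy."""
    p = softmax_probs(theta)
    g = -p
    g[a] += 1.0
    return g

def off_policy_reinforce_grad(theta, mu_probs, actions, returns):
    """Average of rho * G * grad log pi_theta(a) over samples."""
    p = softmax_probs(theta)
    grads = []
    for a, G in zip(actions, returns):
        rho = p[a] / mu_probs[a]  # importance weight
        grads.append(rho * G * grad_log_softmax(theta, a))
    return np.mean(grads, axis=0)

# Usage: two actions, uniform behavior policy, action 1 yields return 1.
rng = np.random.default_rng(0)
theta = np.zeros(2)
mu = np.array([0.5, 0.5])
actions = rng.integers(0, 2, size=1000)
returns = np.where(actions == 1, 1.0, 0.0)
g = off_policy_reinforce_grad(theta, mu, actions, returns)
# the estimate pushes probability mass toward the better action 1
```

The importance weight corrects for the mismatch between behavior and target policies, which is the mechanism whose variance drives the convergence-rate comparison above.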
no code implementations • 26 Feb 2020 • Nirav Bhavsar, Prashanth L. A
We introduce biased gradient oracles to capture a setting where the function measurements have an estimation error that can be controlled through a batch size parameter.
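A minimal sketch of such an oracle, under assumed details: each function measurement is a batch average of noisy evaluations, so the residual estimation error in the resulting finite-difference gradient shrinks roughly like O(1/sqrt(m)) in the batch size m. The noise model and parameter choices are illustrative:

```python
import numpy as np

# Hypothetical biased gradient oracle: a central-difference gradient
# built from noisy function measurements, where a batch of m samples
# is averaged per measurement to control the estimation error.

def noisy_measurement(f, x, m, rng, sigma=1.0):
    """Average of m noisy evaluations of f at x; error is O(1/sqrt(m))."""
    return f(x) + sigma * rng.standard_normal(m).mean()

def biased_grad_oracle(f, x, m, rng, delta=0.1):
    """Central-difference gradient from batched measurements; the bias
    depends on both the spacing delta and the residual batch noise."""
    grad = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = delta
        fp = noisy_measurement(f, x + e, m, rng)
        fm = noisy_measurement(f, x - e, m, rng)
        grad[i] = (fp - fm) / (2 * delta)
    return grad

# Usage: f(x) = ||x||^2, so the true gradient at x0 is 2 * x0.
rng = np.random.default_rng(1)
f = lambda x: float(x @ x)
x0 = np.array([1.0, -2.0])
g_small = biased_grad_oracle(f, x0, m=10, rng=rng)
g_large = biased_grad_oracle(f, x0, m=100_000, rng=rng)
# with a large batch, the estimate concentrates around [2, -4]
```

Increasing m trades extra measurements for a tighter gradient estimate, which is the knob the abstract refers to.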
no code implementations • 8 Feb 2019 • Vinay Praneeth Boda, Prashanth L. A
Motivated by such applications, we formulate the correlated bandit problem, where the objective is to find the arm with the lowest mean-squared error (MSE) in estimating all the arms.
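As a toy illustration of the selection criterion (not the paper's formulation), assume observing arm i yields unbiased estimates of every arm j's mean with known per-round variance V[i, j]; sampling arm i then incurs a summed MSE proportional to the row sum of V, so the best arm minimizes that row sum:

```python
import numpy as np

# Illustrative assumption: V[i, j] is the per-round variance of the
# estimate of arm j's mean obtained by pulling arm i. Sampling arm i
# for n rounds gives total MSE sum_j V[i, j] / n across all arms.

def best_arm_for_estimation(V):
    """Return the arm whose pulls minimize the summed MSE of
    estimating all arm means."""
    total_mse = V.sum(axis=1)  # per-round MSE summed over arms
    return int(np.argmin(total_mse))

V = np.array([[1.0, 4.0, 4.0],
              [2.0, 1.0, 2.0],
              [3.0, 3.0, 1.0]])
best = best_arm_for_estimation(V)  # arm 1: smallest row sum (5.0)
```

The bandit problem in the paper is harder because these variances are unknown and must be learned while sampling; this sketch only shows the objective being optimized.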
1 code implementation • 8 Aug 2018 • Prashanth L. A, Shalabh Bhatnagar, Nirav Bhavsar, Michael Fu, Steven I. Marcus
We introduce deterministic perturbation schemes for the recently proposed random directions stochastic approximation (RDSA) [17], and propose new first-order and second-order algorithms.
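A hedged sketch of the idea: an RDSA-style two-measurement gradient estimate where the perturbation directions come from a deterministic cycle (here, rows of a Hadamard matrix, whose +/-1 entries average out like random signs). The specific matrix choice and scaling are illustrative assumptions:

```python
import numpy as np

# RDSA-style gradient estimate with a deterministic perturbation cycle.
# For the full set of Hadamard rows, (1/n) * H^T H = I, which plays the
# role of E[d d^T] = I for random directions.

def hadamard(n):
    """Sylvester construction; n must be a power of two."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def rdsa_deterministic_grad(f, x, delta=1e-3):
    """One pass over the perturbation cycle: average of
    d * (f(x + delta*d) - f(x - delta*d)) / (2*delta)."""
    H = hadamard(len(x))  # assumes len(x) is a power of two
    ests = [d * (f(x + delta * d) - f(x - delta * d)) / (2 * delta)
            for d in H]
    return np.mean(ests, axis=0)

f = lambda x: float(x @ x)       # true gradient is 2 * x
x0 = np.array([0.5, -1.0])
g = rdsa_deterministic_grad(x=x0, f=f)
# for this quadratic, the cycle-averaged estimate recovers 2 * x0 exactly
```

Replacing random directions with a deterministic cycle removes perturbation randomness from the estimate, which is the motivation for the schemes proposed in the paper.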
no code implementations • 12 May 2014 • Prashanth L. A
We study a risk-constrained version of the stochastic shortest path (SSP) problem, where the risk measure considered is Conditional Value-at-Risk (CVaR).
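The risk measure in question can be written via the Rockafellar-Uryasev representation, CVaR_alpha(X) = min_v { v + E[(X - v)^+] / (1 - alpha) }, whose minimizer is the alpha-quantile (VaR). A minimal empirical version, with illustrative names:

```python
import numpy as np

# Empirical CVaR of a loss sample: the average of the worst
# (1 - alpha) fraction of losses, computed via the
# Rockafellar-Uryasev formula with v set to the empirical VaR.

def empirical_cvar(losses, alpha):
    """CVaR_alpha estimate from a sample of losses."""
    losses = np.asarray(losses, dtype=float)
    var = np.quantile(losses, alpha)  # empirical VaR (alpha-quantile)
    return var + np.mean(np.maximum(losses - var, 0.0)) / (1.0 - alpha)

losses = np.arange(1.0, 11.0)        # losses 1, 2, ..., 10
c = empirical_cvar(losses, alpha=0.8)
# average of the worst 20% of losses, i.e. (9 + 10) / 2 = 9.5
```

In the risk-constrained SSP, a quantity like this is bounded as a constraint while the expected cost is minimized; this sketch only shows how the constraint value itself is evaluated.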