Search Results for author: Ramki Gummadi

Found 5 papers, 0 papers with code

A Parametric Class of Approximate Gradient Updates for Policy Optimization

no code implementations • 17 Jun 2022 • Ramki Gummadi, Saurabh Kumar, Junfeng Wen, Dale Schuurmans

Approaches to policy optimization have been motivated from diverse principles, based on how the parametric model is interpreted (e.g., value versus policy representation) or how the learning objective is formulated, yet they share a common goal of maximizing expected return.

Understanding and Leveraging Overparameterization in Recursive Value Estimation

no code implementations • ICLR 2022 • Chenjun Xiao, Bo Dai, Jincheng Mei, Oscar A Ramirez, Ramki Gummadi, Chris Harris, Dale Schuurmans

To better understand the utility of deep models in RL, we present an analysis of recursive value estimation using overparameterized linear representations that provides useful, transferable findings.

Reinforcement Learning (RL) • Value prediction

Characterizing the Gap Between Actor-Critic and Policy Gradient

no code implementations • 13 Jun 2021 • Junfeng Wen, Saurabh Kumar, Ramki Gummadi, Dale Schuurmans

Actor-critic (AC) methods are ubiquitous in reinforcement learning.

Surrogate Objectives for Batch Policy Optimization in One-step Decision Making

no code implementations • NeurIPS 2019 • Minmin Chen, Ramki Gummadi, Chris Harris, Dale Schuurmans

We investigate batch policy optimization for cost-sensitive classification and contextual bandits, two related tasks that obviate exploration but require generalizing from observed rewards to action selections in unseen contexts.

Decision Making • Multi-Armed Bandits

Variational Rejection Sampling

no code implementations • 5 Apr 2018 • Aditya Grover, Ramki Gummadi, Miguel Lazaro-Gredilla, Dale Schuurmans, Stefano Ermon

Learning latent variable models with stochastic variational inference is challenging when the approximate posterior is far from the true posterior, due to high variance in the gradient estimates.

Variational Inference
