Search Results for author: Chinmaya Kausik

A Framework for Partially Observed Reward-States in RLHF

We show reductions from the two dominant forms of human feedback in RLHF, cardinal and dueling feedback, to PORRL.

Motivated by this, we study supervised denoising and noisy-input regression under distribution shift.

Evaluating and optimizing policies in the presence of unobserved confounders is a problem of growing interest in offline reinforcement learning.

We present an algorithm for learning mixtures of Markov chains and Markov decision processes (MDPs) from short unlabeled trajectories.

