Search Results for author: Dibya Ghosh

Found 17 papers, 8 papers with code

Accelerating Exploration with Unlabeled Prior Data

1 code implementation • NeurIPS 2023 • Qiyang Li, Jason Zhang, Dibya Ghosh, Amy Zhang, Sergey Levine

Learning to solve tasks from a sparse reward signal is a major challenge for standard reinforcement learning (RL) algorithms.

Reinforcement Learning (RL)

Robotic Offline RL from Internet Videos via Value-Function Pre-Training

no code implementations • 22 Sep 2023 • Chethan Bhateja, Derek Guo, Dibya Ghosh, Anikait Singh, Manan Tomar, Quan Vuong, Yevgen Chebotar, Sergey Levine, Aviral Kumar

Our system, called V-PTR, combines the benefits of pre-training on video data with robotic offline RL approaches that train on diverse robot data, resulting in value functions and policies for manipulation tasks that perform better, act robustly, and generalize broadly.

Offline RL · Reinforcement Learning (RL)

HIQL: Offline Goal-Conditioned RL with Latent States as Actions

1 code implementation • NeurIPS 2023 • Seohong Park, Dibya Ghosh, Benjamin Eysenbach, Sergey Levine

This structure can be very useful, as assessing the quality of actions for nearby goals is typically easier than for more distant goals.

Reinforcement Learning (RL) · Unsupervised Pre-training
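
To make the hierarchy in the title concrete, here is a minimal sketch of a two-level goal-conditioned policy: a high-level network proposes a nearby subgoal in a latent space, and a low-level network treats that subgoal as its target. Network shapes and the composition below are illustrative assumptions, not the HIQL reference implementation.

```python
import torch
import torch.nn as nn

# Hedged sketch of "latent states as actions": the high level picks a nearby
# subgoal, the low level acts toward it. Dimensions are illustrative.
obs_dim, goal_dim, act_dim, latent_dim = 8, 8, 2, 4

high_level = nn.Sequential(  # pi_hi(z | s, g): subgoal in latent space
    nn.Linear(obs_dim + goal_dim, 64), nn.ReLU(), nn.Linear(64, latent_dim))
low_level = nn.Sequential(   # pi_lo(a | s, z): action toward the subgoal
    nn.Linear(obs_dim + latent_dim, 64), nn.ReLU(), nn.Linear(64, act_dim))

def act(s, g):
    z = high_level(torch.cat([s, g], dim=-1))    # propose a nearby subgoal
    return low_level(torch.cat([s, z], dim=-1))  # move toward that subgoal

a = act(torch.randn(1, obs_dim), torch.randn(1, goal_dim))
```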

Reinforcement Learning from Passive Data via Latent Intentions

1 code implementation • 10 Apr 2023 • Dibya Ghosh, Chethan Bhateja, Sergey Levine

Passive observational data, such as human videos, is abundant and rich in information, yet remains largely untapped by current RL methods.

reinforcement-learning · Value prediction

Distributionally Adaptive Meta Reinforcement Learning

no code implementations • 6 Oct 2022 • Anurag Ajay, Abhishek Gupta, Dibya Ghosh, Sergey Levine, Pulkit Agrawal

In this work, we develop a framework for meta-RL algorithms that are able to behave appropriately under test-time distribution shifts in the space of tasks.

Meta Reinforcement Learning · reinforcement-learning · +1

Offline RL Policies Should be Trained to be Adaptive

no code implementations • 5 Jul 2022 • Dibya Ghosh, Anurag Ajay, Pulkit Agrawal, Sergey Levine

Offline RL algorithms must account for the fact that the dataset they are provided may leave many facets of the environment unknown.

Offline RL

Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning

1 code implementation • ICLR 2021 • Aviral Kumar, Rishabh Agarwal, Dibya Ghosh, Sergey Levine

We identify an implicit under-parameterization phenomenon in value-based deep RL methods that use bootstrapping: when value functions, approximated using deep neural networks, are trained with gradient descent using iterated regression onto target values generated by previous instances of the value network, more gradient updates decrease the expressivity of the current value network.

reinforcement-learning · Reinforcement Learning (RL)
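
One way to make the "expressivity" in the snippet measurable is the effective rank of the value network's penultimate-layer features. Below is a small numpy sketch of such a measure; the exact definition and threshold used in the paper may differ.

```python
import numpy as np

def srank(features, delta=0.01):
    """Effective rank of a feature matrix: the smallest k whose top-k
    singular values capture a (1 - delta) fraction of the spectrum's mass.
    A feature matrix that collapses toward low rank yields a small srank."""
    s = np.linalg.svd(features, compute_uv=False)
    cumulative = np.cumsum(s) / np.sum(s)
    return int(np.searchsorted(cumulative, 1.0 - delta) + 1)

rng = np.random.default_rng(0)
full = rng.normal(size=(256, 64))                    # well-spread features
collapsed = full[:, :4] @ rng.normal(size=(4, 64))   # rank-4 features
print(srank(full), srank(collapsed))                 # high rank vs. 4
```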

Representations for Stable Off-Policy Reinforcement Learning

no code implementations • ICML 2020 • Dibya Ghosh, Marc G. Bellemare

Reinforcement learning with function approximation can be unstable and even divergent, especially when combined with off-policy learning and Bellman updates.

reinforcement-learning · Reinforcement Learning (RL) · +1

An operator view of policy gradient methods

no code implementations • NeurIPS 2020 • Dibya Ghosh, Marlos C. Machado, Nicolas Le Roux

We cast policy gradient methods as the repeated application of two operators: a policy improvement operator $\mathcal{I}$, which maps any policy $\pi$ to a better one $\mathcal{I}\pi$, and a projection operator $\mathcal{P}$, which finds the best approximation of $\mathcal{I}\pi$ in the set of realizable policies.

Policy Gradient Methods
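
Spelled out, the snippet's decomposition says one policy-gradient step is projection after improvement: $\pi_{k+1} = (\mathcal{P} \circ \mathcal{I})\,\pi_k$, where $\mathcal{P}\nu = \arg\min_{\pi \in \Pi} d(\nu, \pi)$ for some divergence $d$ over the realizable set $\Pi$; different concrete choices of $\mathcal{I}$ and $d$ recover different methods.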

On Catastrophic Interference in Atari 2600 Games

1 code implementation • 28 Feb 2020 • William Fedus, Dibya Ghosh, John D. Martin, Marc G. Bellemare, Yoshua Bengio, Hugo Larochelle

Our study provides a clear empirical link between catastrophic interference and sample efficiency in reinforcement learning.

Atari Games · reinforcement-learning · +1

Learning to Reach Goals via Iterated Supervised Learning

2 code implementations • ICLR 2021 • Dibya Ghosh, Abhishek Gupta, Ashwin Reddy, Justin Fu, Coline Devin, Benjamin Eysenbach, Sergey Levine

Current reinforcement learning (RL) algorithms can be brittle and difficult to use, especially when learning goal-reaching behaviors from sparse rewards.

Multi-Goal Reinforcement Learning · Reinforcement Learning (RL)
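
A hedged sketch of the iterated-supervised-learning loop suggested by the title: roll out the current goal-conditioned policy, relabel each trajectory with a goal it actually reached, and regress onto its own actions by maximum likelihood. The environment stub, the relabeling rule, and the Gaussian (MSE) likelihood below are illustrative assumptions, not the paper's exact algorithm.

```python
import torch
import torch.nn as nn

# Hedged sketch of goal-relabeled supervised learning: every trajectory is
# treated as a demonstration for the states it actually visited.
obs_dim, act_dim = 4, 2
policy = nn.Sequential(nn.Linear(obs_dim * 2, 64), nn.ReLU(),
                       nn.Linear(64, act_dim))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def rollout(horizon=20):
    """Stand-in for environment interaction: returns (states, actions)."""
    return torch.randn(horizon + 1, obs_dim), torch.randn(horizon, act_dim)

for _ in range(10):                          # outer iterations
    states, actions = rollout()
    for t in range(len(actions)):
        # Relabel: a later visited state serves as the goal, so
        # (state, goal) -> action becomes a valid supervised target.
        goal = states[-1]                    # simplest choice: final state
        inp = torch.cat([states[t], goal])
        loss = ((policy(inp) - actions[t]) ** 2).mean()  # Gaussian MLE
        opt.zero_grad(); loss.backward(); opt.step()
```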

Learning to Reach Goals Without Reinforcement Learning

no code implementations • 25 Sep 2019 • Dibya Ghosh, Abhishek Gupta, Justin Fu, Ashwin Reddy, Coline Devin, Benjamin Eysenbach, Sergey Levine

By maximizing the likelihood of good actions provided by an expert demonstrator, supervised imitation learning can produce effective policies without the algorithmic complexities and optimization challenges of reinforcement learning, at the cost of requiring an expert demonstrator -- typically a person -- to provide the demonstrations.

Imitation Learning · reinforcement-learning · +1

Learning Actionable Representations with Goal Conditioned Policies

no code implementations • ICLR 2019 • Dibya Ghosh, Abhishek Gupta, Sergey Levine

Most prior work on representation learning has focused on generative approaches, learning representations that capture all the underlying factors of variation in the observation space in a more disentangled or well-ordered manner.

Decision Making · Hierarchical Reinforcement Learning · +3

Learning Actionable Representations with Goal-Conditioned Policies

1 code implementation • 19 Nov 2018 • Dibya Ghosh, Abhishek Gupta, Sergey Levine

Most prior work on representation learning has focused on generative approaches, learning representations that capture all underlying factors of variation in the observation space in a more disentangled or well-ordered manner.

Decision Making · Hierarchical Reinforcement Learning · +3

Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition

no code implementations • NeurIPS 2018 • Justin Fu, Avi Singh, Dibya Ghosh, Larry Yang, Sergey Levine

We propose variational inverse control with events (VICE), which generalizes inverse reinforcement learning methods to cases where full demonstrations are not needed, such as when only samples of desired goal states are available.

Continuous Control · reinforcement-learning · +1
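
One illustrative reading of the snippet, sketched below: fit a classifier that separates user-provided samples of desired goal states from states the policy visits, and use its log-probability of the event as a reward. The architecture and objective here are assumptions, not the exact VICE formulation.

```python
import torch
import torch.nn as nn

# Hedged sketch: learn p(event | s) from goal-state samples (positives)
# vs. states visited by the policy (negatives), then reward with its log-prob.
obs_dim = 6
event_model = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                            nn.Linear(64, 1))
opt = torch.optim.Adam(event_model.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

goal_states = torch.randn(128, obs_dim)    # user-provided success examples
policy_states = torch.randn(128, obs_dim)  # states from current rollouts

for _ in range(100):
    logits = event_model(torch.cat([goal_states, policy_states]))
    labels = torch.cat([torch.ones(128, 1), torch.zeros(128, 1)])
    loss = bce(logits, labels)
    opt.zero_grad(); loss.backward(); opt.step()

def reward(s):                             # log p(event | s) as the reward
    return nn.functional.logsigmoid(event_model(s))
```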

Divide-and-Conquer Reinforcement Learning

1 code implementation ICLR 2018 Dibya Ghosh, Avi Singh, Aravind Rajeswaran, Vikash Kumar, Sergey Levine

In this paper, we develop a novel algorithm that instead partitions the initial state space into "slices", and optimizes an ensemble of policies, each on a different slice.

Policy Gradient Methods · reinforcement-learning · +1
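
A hedged sketch of the partitioning step the snippet describes: cluster initial states into "slices" (k-means here, as one plausible choice) and assign each slice its own policy. The clustering method and placeholder policies are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Partition initial states into "slices" and give each slice its own policy.
rng = np.random.default_rng(0)
initial_states = rng.normal(size=(500, 3))
k = 4                                          # number of slices

# Lightweight k-means to define the slices.
centers = initial_states[rng.choice(len(initial_states), k, replace=False)]
for _ in range(20):
    dists = np.linalg.norm(initial_states[:, None] - centers[None], axis=-1)
    assign = dists.argmin(axis=1)
    centers = np.stack([initial_states[assign == i].mean(axis=0)
                        if (assign == i).any() else centers[i]
                        for i in range(k)])

# One policy per slice; each member of the ensemble would be optimized only
# on initial states from its own slice.
policies = {i: f"policy_{i}" for i in range(k)}   # placeholders
print({i: int((assign == i).sum()) for i in range(k)})
```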
