no code implementations • 16 Apr 2024 • Caroline Wang, Arrasy Rahman, Ishan Durugkar, Elad Liebman, Peter Stone
POAM is a policy-gradient, multi-agent reinforcement learning approach to the NAHT problem that enables adaptation to diverse teammates by learning representations of their behaviors.
no code implementations • 10 Oct 2023 • Siddhant Agarwal, Ishan Durugkar, Peter Stone, Amy Zhang
We further introduce an entropy-regularized policy optimization objective, which we call $state$-MaxEnt RL (or $s$-MaxEnt RL), as a special case of our objective.
no code implementations • 8 Nov 2022 • Eddy Hudson, Ishan Durugkar, Garrett Warnell, Peter Stone
Given a dataset of expert agent interactions with an environment of interest, a viable method to extract an effective agent policy is to estimate the maximum likelihood policy indicated by this data.
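In the tabular case, one way to read "the maximum likelihood policy indicated by this data" is behavioral cloning by normalized action counts. A minimal sketch of that reading (function and variable names are hypothetical, not from the paper):

```python
from collections import Counter, defaultdict

def mle_policy(demos):
    """Estimate the maximum-likelihood tabular policy from expert
    (state, action) pairs: normalized per-state action counts."""
    counts = defaultdict(Counter)
    for s, a in demos:
        counts[s][a] += 1
    policy = {}
    for s, ctr in counts.items():
        total = sum(ctr.values())
        policy[s] = {a: n / total for a, n in ctr.items()}
    return policy

# Toy dataset: the expert picks "right" in state 0 three times out of four.
demos = [(0, "right"), (0, "right"), (0, "right"), (0, "left"), (1, "up")]
pi = mle_policy(demos)  # pi[0]["right"] == 0.75
```

With function approximation, the same idea becomes maximizing log-likelihood of the expert actions under a parameterized policy.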
1 code implementation • 1 Jun 2022 • Caroline Wang, Ishan Durugkar, Elad Liebman, Peter Stone
The theoretical analysis shows that, under certain conditions, each agent minimizing its individual distribution mismatch yields convergence to the joint policy that generated the target distribution.
no code implementations • 28 Oct 2021 • Ishan Durugkar, Steven Hansen, Stephen Spencer, Volodymyr Mnih
This paper deals with the problem of learning a skill-conditioned policy that acts meaningfully in the absence of a reward signal.
1 code implementation • NeurIPS 2021 • Ishan Durugkar, Mauricio Tec, Scott Niekum, Peter Stone
In this paper, we investigate whether one such objective, the Wasserstein-1 distance between a policy's state visitation distribution and a target distribution, can be utilized effectively for reinforcement learning (RL) tasks.
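For distributions over an ordered one-dimensional support, the Wasserstein-1 distance has a simple closed form: the L1 distance between the two CDFs. A minimal illustration of that special case (not the paper's estimator, which must handle general state spaces):

```python
import numpy as np

def wasserstein1_1d(p, q):
    """Wasserstein-1 distance between two distributions over the same
    ordered 1-D support with unit spacing: sum of |CDF_p - CDF_q|."""
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())

# Visitation mass concentrated on state 0 vs. a target on state 2:
p = np.array([1.0, 0.0, 0.0])
q = np.array([0.0, 0.0, 1.0])
d = wasserstein1_1d(p, q)  # all the mass must travel 2 states, so d == 2.0
```

Unlike KL divergence, this distance stays finite and informative when the two distributions have disjoint support, which is what makes it attractive as an RL objective.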
no code implementations • ICML 2020 • Brahma Pavse, Ishan Durugkar, Josiah Hanna, Peter Stone
In this batch setting, we show that TD(0) may converge to an inaccurate value function because the update following an action is weighted according to the number of times that action occurred in the batch, rather than the true probability of the action under the given policy.
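A tiny numerical sketch of that failure mode (toy setup, not the paper's experiments): one state, one-step terminal episodes, and a true policy that picks a reward-1 and a reward-0 action with equal probability, so the true value is 0.5. If the batch happens to contain the reward-1 action twice, repeated TD(0) sweeps converge toward the empirical batch frequency instead.

```python
def batch_td0_value(batch_rewards, alpha=0.01, sweeps=2000):
    """Repeated TD(0) sweeps over a fixed batch of one-step episodes
    from a single state; transitions are terminal, so each TD target
    is just the observed reward."""
    v = 0.0
    for _ in range(sweeps):
        for r in batch_rewards:
            v += alpha * (r - v)
    return v

# Batch over-represents the reward-1 action (2 of 3 transitions),
# so v converges near the batch mean 2/3, not the true value 0.5.
v_batch = batch_td0_value([1.0, 1.0, 0.0])
```

The estimate is biased toward 2/3 purely because of the batch's empirical action counts, which is the weighting problem the paper identifies.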
no code implementations • NeurIPS 2020 • Siddharth Desai, Ishan Durugkar, Haresh Karnan, Garrett Warnell, Josiah Hanna, Peter Stone
We examine the problem of transferring a policy learned in a source environment to a target environment with different dynamics, particularly in the case where it is critical to reduce the amount of interaction with the target environment during learning.
no code implementations • ICLR 2019 • Ishan Durugkar, Bo Liu, Peter Stone
Temporal Difference learning with function approximation has recently seen wide use and has contributed to several notable successes.
no code implementations • 5 Apr 2019 • Ishan Durugkar, Matthew Hausknecht, Adith Swaminathan, Patrick MacAlpine
Policy gradient algorithms typically combine discounted future rewards with an estimated value function to compute the direction and magnitude of parameter updates.
no code implementations • ICLR 2018 • Ishan Durugkar, Peter Stone
In this work we propose a constraint on the TD update that minimizes change to the target values.
7 code implementations • ICLR 2018 • Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum
Knowledge bases (KBs), whether automatically or manually constructed, are often incomplete: many valid facts can be inferred from a KB by synthesizing existing information.
1 code implementation • 5 Nov 2016 • Ishan Durugkar, Ian Gemp, Sridhar Mahadevan
Generative adversarial networks (GANs) are a framework for producing a generative model by way of a two-player minimax game.
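The two-player minimax game can be made concrete with the standard GAN losses: the discriminator is trained to label real samples 1 and generated samples 0, while the generator is trained to push the discriminator's output on its samples toward 1. A minimal sketch of those objectives on raw probabilities (an illustration of the standard framework, not this paper's specific variant):

```python
import numpy as np

def bce(pred, target):
    """Binary cross-entropy on probabilities (eps avoids log(0))."""
    eps = 1e-12
    return float(-(target * np.log(pred + eps)
                   + (1 - target) * np.log(1 - pred + eps)).mean())

def gan_losses(d_real, d_fake):
    """Discriminator loss: real -> 1, fake -> 0.
    Generator loss: fake -> 1 (the common non-saturating form)."""
    d_loss = bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))
    g_loss = bce(d_fake, np.ones_like(d_fake))
    return d_loss, g_loss

# A confident discriminator (0.9 on real, 0.1 on fake) has low loss,
# while the generator's loss is correspondingly high.
d_loss, g_loss = gan_losses(np.array([0.9]), np.array([0.1]))
```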
Ranked #67 on Image Generation on CIFAR-10 (Inception score metric)
no code implementations • 21 Aug 2016 • Ian Gemp, Ishan Durugkar, Mario Parente, M. Darby Dyar, Sridhar Mahadevan
Recent advances in semi-supervised learning with deep generative models have shown promise in generalizing from small labeled datasets ($\mathbf{x},\mathbf{y}$) to large unlabeled ones ($\mathbf{x}$).