Search Results for author: Prakash Panangaden

Found 12 papers, 4 papers with code

Conditions on Preference Relations that Guarantee the Existence of Optimal Policies

no code implementations • 3 Nov 2023 • Jonathan Colaço Carr, Prakash Panangaden, Doina Precup

Current results guaranteeing the existence of optimal policies in LfPF problems assume that both the preferences and transition dynamics are determined by a Markov Decision Process.

Decision Making
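
For orientation, the existence claim in the abstract refers to the standard notion of optimality in a discounted MDP. The sketch below is textbook background, not material from the paper:

```latex
% Standard discounted-MDP background (not from the paper):
% a policy \pi^* is optimal when its value dominates that of every other policy.
V^{\pi}(s) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \,\middle|\, s_0 = s\right],
\qquad
\pi^{*} \ \text{optimal} \iff V^{\pi^{*}}(s) \ge V^{\pi}(s) \ \ \text{for all } \pi, s.
```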

A Kernel Perspective on Behavioural Metrics for Markov Decision Processes

no code implementations • 5 Oct 2023 • Pablo Samuel Castro, Tyler Kastner, Prakash Panangaden, Mark Rowland

Behavioural metrics have been shown to be an effective mechanism for constructing representations in reinforcement learning.

reinforcement-learning
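
For background, behavioural metrics of this kind are usually defined as fixed points that compare immediate rewards and transition distributions. The bisimulation-style form below (with γ as a weighting constant) is a generic illustration, not a formula taken from this paper:

```latex
% Generic bisimulation-style behavioural metric (background only):
% d is the fixed point of an operator combining a reward difference with the
% 1-Wasserstein distance W_1 between transition distributions.
d(x, y) = \max_{a}\Big( \left| R(x,a) - R(y,a) \right|
          + \gamma\, W_1(d)\big(P(\cdot \mid x, a),\, P(\cdot \mid y, a)\big) \Big)
```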

Continuous MDP Homomorphisms and Homomorphic Policy Gradient

1 code implementation • 15 Sep 2022 • Sahand Rezaei-Shoshtari, Rosie Zhao, Prakash Panangaden, David Meger, Doina Precup

Abstraction has been widely studied as a way to improve the efficiency and generalization of reinforcement learning algorithms.

Continuous Control • Policy Gradient Methods +2

Riemannian Diffusion Models

no code implementations • 16 Aug 2022 • Chin-wei Huang, Milad Aghajohari, Avishek Joey Bose, Prakash Panangaden, Aaron Courville

In this work, we generalize continuous-time diffusion models to arbitrary Riemannian manifolds and derive a variational framework for likelihood estimation.

Image Generation
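
For orientation, the Euclidean setting being generalized is the continuous-time (score-based) diffusion model; the standard forward SDE and its time reversal are shown below as background, not as equations from the paper:

```latex
% Standard Euclidean continuous-time diffusion background (the setting generalized to manifolds):
% forward noising SDE and its score-based time reversal.
\mathrm{d}X_t = f(X_t, t)\,\mathrm{d}t + g(t)\,\mathrm{d}W_t,
\qquad
\mathrm{d}X_t = \big[f(X_t, t) - g(t)^2\,\nabla_x \log p_t(X_t)\big]\,\mathrm{d}t + g(t)\,\mathrm{d}\bar{W}_t.
```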

Extracting Weighted Automata for Approximate Minimization in Language Modelling

no code implementations • 5 Jun 2021 • Clara Lacroce, Prakash Panangaden, Guillaume Rabusseau

The objective is to obtain a weighted finite automaton (WFA) that fits within a given size constraint and which mimics the behaviour of the original model while minimizing some notion of distance between the black box and the extracted WFA.

Language Modelling
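
As a reminder of the object being extracted, a weighted finite automaton is given by an initial vector, one transition matrix per alphabet symbol, and a final vector, and it scores a string by a product of matrices. The toy example below only illustrates that standard definition; it is not an automaton produced by the paper's method:

```python
import numpy as np

# Toy weighted finite automaton (WFA), standard definition:
# f(w) = alpha^T A_{w_1} ... A_{w_k} beta.
alpha = np.array([1.0, 0.0])                  # initial weights
beta = np.array([0.0, 1.0])                   # final weights
A = {
    "a": np.array([[0.5, 0.5], [0.0, 1.0]]),  # transition matrix for symbol 'a'
    "b": np.array([[1.0, 0.0], [0.2, 0.8]]),  # transition matrix for symbol 'b'
}

def wfa_score(word):
    """Score a string by multiplying its transition matrices between alpha and beta."""
    v = alpha
    for symbol in word:
        v = v @ A[symbol]
    return float(v @ beta)

print(wfa_score("ab"))  # weight the toy WFA assigns to the string "ab"
```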

MICo: Improved representations via sampling-based state similarity for Markov decision processes

2 code implementations • NeurIPS 2021 • Pablo Samuel Castro, Tyler Kastner, Prakash Panangaden, Mark Rowland

We present a new behavioural distance over the state space of a Markov decision process, and demonstrate the use of this distance as an effective means of shaping the learnt representations of deep reinforcement learning agents.

Atari Games • reinforcement-learning +1
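
A minimal sketch of the general pattern of a sampling-based behavioural distance on a toy tabular MDP: compare rewards and bootstrap through independently sampled next states. This is only an illustration of the idea; it is not the paper's exact MICo operator:

```python
import numpy as np

# Illustrative sampling-based behavioural distance on a toy tabular MDP.
# Pattern: |reward difference| + discounted bootstrapped distance at sampled next states.
rng = np.random.default_rng(0)
n_states, gamma, alpha = 4, 0.9, 0.1
r = rng.random(n_states)                              # per-state rewards under a fixed policy
P = rng.dirichlet(np.ones(n_states), size=n_states)   # per-state next-state distributions

U = np.zeros((n_states, n_states))                    # distance estimate over state pairs
for _ in range(20000):
    x, y = rng.integers(n_states, size=2)
    x_next = rng.choice(n_states, p=P[x])             # independent next-state samples
    y_next = rng.choice(n_states, p=P[y])
    target = abs(r[x] - r[y]) + gamma * U[x_next, y_next]
    U[x, y] += alpha * (target - U[x, y])

print(np.round(U, 3))
```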

A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms

no code implementations • 27 Mar 2020 • Philip Amortila, Doina Precup, Prakash Panangaden, Marc G. Bellemare

We present a distributional approach to theoretical analyses of reinforcement learning algorithms for constant step-sizes.

Q-Learning • reinforcement-learning +1
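
For context, the object of such an analysis is a sampling-based algorithm run with a constant step size, e.g. tabular Q-learning as sketched below (a toy illustration, not code from the paper). With a constant step size the iterates keep fluctuating instead of converging to a point, which is why studying their long-run distribution is natural:

```python
import numpy as np

# Constant step-size tabular Q-learning on a toy MDP (illustration only).
rng = np.random.default_rng(1)
n_states, n_actions, gamma, alpha = 3, 2, 0.9, 0.1
R = rng.random((n_states, n_actions))
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a] is a distribution

Q = np.zeros((n_states, n_actions))
s = 0
for _ in range(50000):
    a = rng.integers(n_actions)                   # uniform exploration policy
    s_next = rng.choice(n_states, p=P[s, a])
    target = R[s, a] + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])         # constant step size: iterates never settle
    s = s_next

print(np.round(Q, 3))
```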

Latent Variable Modelling with Hyperbolic Normalizing Flows

1 code implementation • ICML 2020 • Avishek Joey Bose, Ariella Smofsky, Renjie Liao, Prakash Panangaden, William L. Hamilton

One effective solution is the use of normalizing flows to construct flexible posterior distributions.

Density Estimation • Variational Inference
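
As background on the Euclidean case that this work moves beyond, a normalizing flow builds a flexible density from a simple base density by tracking the Jacobian of an invertible map. The sketch below shows the generic change-of-variables computation, not the paper's hyperbolic construction:

```python
import numpy as np

# Generic 1-D normalizing-flow density via the change-of-variables formula (background only).
def base_log_prob(z):
    """Standard normal log-density."""
    return -0.5 * (z ** 2 + np.log(2 * np.pi))

def flow(z):
    """An invertible map z -> x together with log|dx/dz|."""
    x = np.tanh(z) + 2.0 * z                       # strictly increasing, hence invertible
    log_det = np.log(1.0 - np.tanh(z) ** 2 + 2.0)  # derivative is 1 - tanh(z)^2 + 2 > 0
    return x, log_det

z = np.random.default_rng(2).normal(size=5)
x, log_det = flow(z)
log_px = base_log_prob(z) - log_det                # log p_X(x) = log p_Z(z) - log|dx/dz|
print(np.round(log_px, 3))
```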

Basis refinement strategies for linear value function approximation in MDPs

no code implementations • NeurIPS 2015 • Gheorghe Comanici, Doina Precup, Prakash Panangaden

We provide a theoretical framework for analyzing basis function construction for linear value function approximation in Markov Decision Processes (MDPs).
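
As a reminder of the setting, linear value function approximation represents V(s) as a linear combination of basis functions and fits the weights, for instance with semi-gradient TD(0). The features below are arbitrary placeholders, not a basis built by the paper's refinement strategies:

```python
import numpy as np

# Linear value function approximation with TD(0) on a toy MDP.
# V(s) ~ phi(s) . w; the feature matrix is an arbitrary placeholder basis.
rng = np.random.default_rng(3)
n_states, gamma, alpha = 5, 0.9, 0.05
phi = rng.random((n_states, 3))               # one 3-dimensional feature vector per state
r = rng.random(n_states)                      # per-state rewards under a fixed policy
P = rng.dirichlet(np.ones(n_states), size=n_states)

w = np.zeros(3)
s = 0
for _ in range(30000):
    s_next = rng.choice(n_states, p=P[s])
    td_error = r[s] + gamma * phi[s_next] @ w - phi[s] @ w
    w += alpha * td_error * phi[s]            # semi-gradient TD(0) update
    s = s_next

print("weights:", np.round(w, 3))
print("values :", np.round(phi @ w, 3))
```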

Proceedings of the 11th workshop on Quantum Physics and Logic

no code implementations • 28 Dec 2014 • Bob Coecke, Ichiro Hasuo, Prakash Panangaden

The first QPL under the new name Quantum Physics and Logic was held in Reykjavik (2008), followed by Oxford (2009 and 2010), Nijmegen (2011), Brussels (2012) and Barcelona (2013).
