no code implementations • 8 Jan 2023 • Phillip J. K. Christoffersen, Andrew C. Li, Rodrigo Toro Icarte, Sheila A. McIlraith
Recent work has leveraged Knowledge Representation (KR) to provide symbolic abstractions that summarize reward-relevant properties of the state-action history, supporting the learning of a Markovian decomposition of the problem as an automaton over the KR.
no code implementations • 20 Nov 2022 • Andrew C. Li, Zizhao Chen, Pashootan Vaezipoor, Toryn Q. Klassen, Rodrigo Toro Icarte, Sheila A. McIlraith
Natural and formal languages provide an effective mechanism for humans to specify instructions and reward functions.
1 code implementation • 3 Jun 2022 • Andrew C. Li, Pashootan Vaezipoor, Rodrigo Toro Icarte, Sheila A. McIlraith
Deep reinforcement learning has shown promise in discrete domains requiring complex reasoning, including games such as Chess, Go, and Hanabi.
no code implementations • 17 Dec 2021 • Rodrigo Toro Icarte, Ethan Waldie, Toryn Q. Klassen, Richard Valenzano, Margarita P. Castro, Sheila A. McIlraith
Here we show that RMs can be learned from experience, instead of being specified by the user, and that the resulting problem decomposition can be used to effectively solve partially observable RL problems.
Tasks: Partially Observable Reinforcement Learning, Problem Decomposition, +2
no code implementations • 4 Jun 2021 • Parand Alizadeh Alamdari, Toryn Q. Klassen, Rodrigo Toro Icarte, Sheila A. McIlraith
We endow RL agents with the ability to contemplate such impact by augmenting their reward based on expectation of future return by others in the environment, providing different criteria for characterizing impact.
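One way to read the augmentation described above is as a simple shaping of the agent's reward with an estimate of others' future return. The linear form, the weight, and the function name below are illustrative assumptions, not the paper's exact criterion:

```python
def augmented_reward(own_reward, others_expected_return, caring_weight=0.5):
    """Add a fraction of others' expected future return to the agent's reward.

    A hypothetical linear combination; the paper considers different
    criteria for characterizing impact on others.
    """
    return own_reward + caring_weight * others_expected_return

# An action that costs the agent 1 but preserves 4 units of others'
# expected future return nets a positive augmented reward at weight 0.5.
r = augmented_reward(-1.0, 4.0)  # -1.0 + 0.5 * 4.0 = 1.0
```

Under this sketch, a higher weight makes the agent more willing to sacrifice its own reward to avoid negatively impacting others.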
1 code implementation • 31 May 2021 • Maayan Shvo, Zhiming Hu, Rodrigo Toro Icarte, Iqbal Mohomed, Allan Jepson, Sheila A. McIlraith
We introduce an RL-based framework for learning to accomplish tasks in mobile apps.
1 code implementation • 13 Feb 2021 • Pashootan Vaezipoor, Andrew Li, Rodrigo Toro Icarte, Sheila A. McIlraith
We address the problem of teaching a deep reinforcement learning (RL) agent to follow instructions in multi-task environments.
1 code implementation • 6 Oct 2020 • Maayan Shvo, Andrew C. Li, Rodrigo Toro Icarte, Sheila A. McIlraith
Our automata-based classifiers are interpretable (supporting explanation, counterfactual reasoning, and human-in-the-loop modification) and have strong empirical performance.
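The interpretability claim can be sketched with a deterministic finite automaton over a sequence of abstract symbols: every transition is inspectable, and counterfactuals amount to replaying an edited trace. The alphabet, task, and class structure below are illustrative assumptions, not the classifiers learned in the paper:

```python
# Sketch of an automaton-based sequence classifier (illustrative DFA;
# the paper learns such automata rather than hand-coding them).
class DFAClassifier:
    def __init__(self, delta, start, accepting):
        # delta: {(state, symbol): next_state}; missing keys self-loop.
        self.delta, self.start, self.accepting = delta, start, accepting

    def classify(self, sequence):
        state = self.start
        for symbol in sequence:
            state = self.delta.get((state, symbol), state)
        return state in self.accepting

# Accepts traces in which an "a" is eventually followed by a "b".
clf = DFAClassifier(
    delta={(0, "a"): 1, (1, "b"): 2},
    start=0,
    accepting={2},
)
# clf.classify(["a", "c", "b"]) → True; clf.classify(["b", "a"]) → False
```

Because the model is just a transition table, a human can read off why a trace was accepted and modify individual transitions in the loop.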
3 code implementations • 6 Oct 2020 • Rodrigo Toro Icarte, Toryn Q. Klassen, Richard Valenzano, Sheila A. McIlraith
First, we propose reward machines, a type of finite state machine that supports the specification of reward functions while exposing reward function structure.
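A reward machine of the kind described above can be sketched as a small finite state machine whose transitions fire on high-level events and emit rewards. The event names and the coffee-delivery task below are illustrative assumptions, not examples from the paper:

```python
# Minimal reward-machine sketch (illustrative; events and task are assumed).
# States are RM nodes; transitions fire on abstract events and emit a reward,
# exposing the reward function's structure to the learner.

class RewardMachine:
    def __init__(self, transitions, initial_state, terminal_states):
        # transitions: {(state, event): (next_state, reward)}
        self.transitions = transitions
        self.state = initial_state
        self.terminal_states = terminal_states

    def step(self, event):
        """Advance on an event; unhandled events self-loop with 0 reward."""
        next_state, reward = self.transitions.get(
            (self.state, event), (self.state, 0.0))
        self.state = next_state
        return reward

    def done(self):
        return self.state in self.terminal_states

# Task: get coffee, then deliver it to the office.
rm = RewardMachine(
    transitions={
        ("u0", "coffee"): ("u1", 0.0),
        ("u1", "office"): ("u2", 1.0),
    },
    initial_state="u0",
    terminal_states={"u2"},
)

rewards = [rm.step(e) for e in ["office", "coffee", "office"]]
# The first "office" event is ignored (no coffee yet); reward arrives
# only once the machine has passed through the "coffee" transition.
```

The machine's states make the temporal structure of the task explicit, which is what allows the decomposition into subproblems that the abstract mentions.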
no code implementations • 5 Oct 2020 • Rodrigo Toro Icarte, Richard Valenzano, Toryn Q. Klassen, Phillip Christoffersen, Amir-Massoud Farahmand, Sheila A. McIlraith
Learning memoryless policies is efficient and optimal in fully observable environments.
Tasks: Partially Observable Reinforcement Learning, Reinforcement Learning, +1
1 code implementation • NeurIPS 2019 • Rodrigo Toro Icarte, Ethan Waldie, Toryn Klassen, Rick Valenzano, Margarita Castro, Sheila A. McIlraith
Reward Machines (RMs), originally proposed for specifying problems in Reinforcement Learning (RL), provide a structured, automata-based representation of a reward function that allows an agent to decompose problems into subproblems that can be efficiently learned using off-policy learning.
Tasks: Partially Observable Reinforcement Learning, Problem Decomposition, +2
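The decomposition described above can be sketched in tabular form: the agent conditions its value function on the pair (observation, RM state), so the RM state acts as external memory that restores the Markov property under partial observability. The environment, events, and hyperparameters below are illustrative assumptions:

```python
from collections import defaultdict

# Sketch: tabular Q-learning where the learner's state is (observation,
# rm_state). Two experiences that look identical to the agent's sensors
# (same observation) but occur at different RM states update different
# table entries, which is what makes the decomposition work.

ALPHA, GAMMA = 0.5, 0.9
Q = defaultdict(float)  # Q[((obs, rm_state), action)]

def q_update(obs, u, action, reward, next_obs, next_u, actions):
    """One off-policy Q-learning update on the cross-product state (obs, u)."""
    best_next = max(Q[((next_obs, next_u), a)] for a in actions)
    key = ((obs, u), action)
    Q[key] += ALPHA * (reward + GAMMA * best_next - Q[key])

actions = ["left", "right"]
# Same observation "hall", different RM states u0 and u1.
q_update("hall", "u0", "right", 0.0, "hall", "u0", actions)
q_update("hall", "u1", "right", 1.0, "hall", "u2", actions)
```

Without the RM state, both experiences would collide in the same table entry and the aliased task would be unlearnable with a memoryless policy.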
1 code implementation • ICML 2018 • Rodrigo Toro Icarte, Toryn Klassen, Richard Valenzano, Sheila A. McIlraith
In this paper we propose Reward Machines: a type of finite state machine that supports the specification of reward functions while exposing reward function structure to the learner and supporting decomposition.
1 code implementation • 24 May 2017 • Rodrigo Toro Icarte, Jorge A. Baier, Cristian Ruz, Alvaro Soto
Consequently, a main conclusion of this work is that general-purpose commonsense ontologies improve performance on visual reasoning tasks when properly filtered to select meaningful visual relations.