no code implementations • 27 Oct 2022 • Michael L. Littman, Ifeoma Ajunwa, Guy Berger, Craig Boutilier, Morgan Currie, Finale Doshi-Velez, Gillian Hadfield, Michael C. Horowitz, Charles Isbell, Hiroaki Kitano, Karen Levy, Terah Lyons, Melanie Mitchell, Julie Shah, Steven Sloman, Shannon Vallor, Toby Walsh
In September 2021, the "One Hundred Year Study on Artificial Intelligence" project (AI100) issued the second report of its planned long-term periodic assessment of artificial intelligence (AI) and its impact on society.
no code implementations • 10 Mar 2021 • Himanshu Sahni, Charles Isbell
We also show that the agent's internal representation of the surroundings, a live mental map, can be used for control in two partially observable reinforcement learning tasks.
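A minimal sketch of the live mental map idea, assuming a toy grid world and a glimpse radius of my own choosing (the paper learns the map and the control policy jointly; everything here is an illustrative stand-in):

```python
# Minimal sketch of a "live mental map" under partial observability:
# the agent sees only a small glimpse around its position and stitches
# glimpses into a persistent internal map it can later act from.
# Grid size, glimpse radius, and names are illustrative assumptions.
import numpy as np

world = np.random.randint(0, 2, size=(9, 9))   # hidden environment (e.g. obstacles)
mental_map = np.full_like(world, -1)           # -1 marks "not yet observed"

def observe(pos, radius=1):
    """Write the current egocentric glimpse into the persistent map."""
    r, c = pos
    r0, r1 = max(r - radius, 0), min(r + radius + 1, world.shape[0])
    c0, c1 = max(c - radius, 0), min(c + radius + 1, world.shape[1])
    mental_map[r0:r1, c0:c1] = world[r0:r1, c0:c1]

for pos in [(1, 1), (1, 2), (2, 2), (3, 2)]:   # a short exploratory trajectory
    observe(pos)

coverage = (mental_map >= 0).mean()            # fraction of the world now in memory
print(f"map coverage after 4 steps: {coverage:.0%}")
```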
no code implementations • 10 Jun 2020 • Shray Bansal, Jin Xu, Ayanna Howard, Charles Isbell
We show that using a Bayesian approach to infer the equilibrium enables the robot to complete the task with fewer than half the collisions of the best baseline, while also reducing task execution time.
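A minimal sketch of Bayesian inference over which equilibrium the human is following; the candidate equilibria, likelihood table, and observations below are toy assumptions, not the paper's task or models:

```python
# Keep a belief over candidate equilibria, update it from observed
# human actions via Bayes' rule, and have the robot best-respond to
# the most likely equilibrium.
import numpy as np

equilibria = ["human_goes_first", "robot_goes_first"]
belief = np.array([0.5, 0.5])                      # uniform prior over equilibria

# P(observed human action | equilibrium): toy likelihood table
likelihood = {"human_moves": np.array([0.9, 0.2]),
              "human_waits": np.array([0.1, 0.8])}

for obs in ["human_moves", "human_moves"]:         # observed human behavior
    belief = belief * likelihood[obs]              # Bayes rule: prior x likelihood
    belief /= belief.sum()                         # renormalize

map_eq = equilibria[int(np.argmax(belief))]
robot_action = "wait" if map_eq == "human_goes_first" else "move"
print(belief, "->", robot_action)                  # robot yields to avoid collision
```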
no code implementations • 2 May 2020 • Shray Bansal, Rhys Newbury, Wesley Chan, Akansel Cosgun, Aimee Allen, Dana Kulić, Tom Drummond, Charles Isbell
We compare two robot modes in a shared-table pick-and-place task: (1) Task-oriented: the robot takes only actions that further its own task objective, and (2) Supportive: the robot sometimes prefers supportive actions over task-oriented ones when they reduce future goal conflicts.
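A minimal sketch contrasting the two modes; the candidate actions, scores, and the single weighting term standing in for "reduces future goal conflicts" are illustrative assumptions:

```python
# A task-oriented robot maximizes its own progress; a supportive robot
# also credits actions that reduce predicted future goal conflicts.
candidates = [                                   # (action, own progress, conflict reduction)
    ("place_own_block", 1.0, 0.0),
    ("clear_shared_space", 0.2, 0.9),
]

def choose(actions, supportive):
    # Supportive mode trades some task progress for fewer future conflicts.
    weight = 1.0 if supportive else 0.0
    return max(actions, key=lambda a: a[1] + weight * a[2])[0]

print(choose(candidates, supportive=False))      # -> place_own_block
print(choose(candidates, supportive=True))       # -> clear_shared_space
```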
1 code implementation • 21 Feb 2020 • Ashley D. Edwards, Himanshu Sahni, Rosanne Liu, Jane Hung, Ankit Jain, Rui Wang, Adrien Ecoffet, Thomas Miconi, Charles Isbell, Jason Yosinski
In this paper, we introduce a novel form of value function, $Q(s, s')$, that expresses the utility of transitioning from a state $s$ to a neighboring state $s'$ and then acting optimally thereafter.
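A minimal tabular sketch of the $Q(s, s')$ formulation on a toy chain: values attach to state transitions rather than state-action pairs, and actions are recovered afterwards through inverse dynamics. The paper estimates these quantities with deep function approximators, so the chain, reward, and trivial inverse model below are illustrative assumptions:

```python
import numpy as np

n_states, gamma = 5, 0.9                      # 5-state chain, discount factor
neighbors = {s: [max(s - 1, 0), min(s + 1, n_states - 1)] for s in range(n_states)}
reward = lambda s, s2: 1.0 if s2 == n_states - 1 else 0.0  # sparse goal reward

Q = np.zeros((n_states, n_states))            # Q[s, s'] over transitions
for _ in range(100):                          # value iteration over transitions
    for s in range(n_states):
        for s2 in neighbors[s]:
            # Bellman backup: value of reaching s2, then acting optimally
            Q[s, s2] = reward(s, s2) + gamma * max(Q[s2, s3] for s3 in neighbors[s2])

def act(s):
    """Pick the neighboring state with the highest Q, then the action
    (-1 = left, +1 = right) that an inverse model says reaches it."""
    s2 = max(neighbors[s], key=lambda n: Q[s, n])
    return s2 - s                             # trivial inverse dynamics on a chain

print([act(s) for s in range(n_states)])      # greedy moves toward the goal state
```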
1 code implementation • 15 Feb 2020 • Yannick Schroecker, Charles Isbell
This work considers two distinct settings: imitation learning and goal-conditioned reinforcement learning.
no code implementations • 30 Nov 2017 • Himanshu Sahni, Saurabh Kumar, Farhan Tejani, Charles Isbell
We present a differentiable framework capable of learning a wide variety of compositions of simple policies, which we call skills.
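A minimal sketch of composing skill policies differentiably; the mixture-of-logits operator, skills, and weights below are one simple differentiable composition chosen for illustration, not necessarily the paper's operators:

```python
# Each skill outputs action logits; a learned soft weighting mixes them
# into a composed policy, and gradients flow through the mixture.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

skill_logits = np.array([[2.0, 0.1, 0.1, 0.1],   # skill 0 prefers action 0
                         [0.1, 0.1, 2.0, 0.1]])  # skill 1 prefers action 2

mix = softmax(np.array([0.5, 1.5]))              # learned composition weights
composed = softmax(mix @ skill_logits)           # differentiable w.r.t. mix
print(composed)                                  # composed policy over 4 actions
```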
no code implementations • 24 May 2017 • Himanshu Sahni, Saurabh Kumar, Farhan Tejani, Yannick Schroecker, Charles Isbell
To address this issue, we develop a framework through which a deep RL agent learns to generalize policies from smaller, simpler domains to more complex ones using a recurrent attention mechanism.
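A minimal sketch of the attention idea: crop a fixed-size glimpse of a larger state so a policy trained on small domains can act on it, then re-attend. The sizes and the stand-in policy are toy assumptions (the paper learns the attention recurrently):

```python
import numpy as np

big_state = np.random.rand(16, 16)              # larger, more complex domain
GLIMPSE = 4                                     # size the small-domain policy expects

def glimpse(state, center):
    r, c = center
    return state[r:r + GLIMPSE, c:c + GLIMPSE]  # fixed-size attention window

def small_policy(patch):
    # Stand-in for a policy trained on 4x4 domains: step toward the
    # brightest cell in the patch.
    r, c = np.unravel_index(np.argmax(patch), patch.shape)
    return (int(np.sign(r - GLIMPSE // 2)), int(np.sign(c - GLIMPSE // 2)))

attn = (6, 6)
for _ in range(3):                              # recurrent loop: act, then re-attend
    dr, dc = small_policy(glimpse(big_state, attn))
    attn = (min(max(attn[0] + dr, 0), 12), min(max(attn[1] + dc, 0), 12))
print("final attention center:", attn)
```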
no code implementations • 14 Apr 2017 • Michael L. Littman, Ufuk Topcu, Jie Fu, Charles Isbell, Min Wen, James MacGlashan
We propose a new task-specification language for Markov decision processes that is designed to improve on reward functions by being environment-independent.
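A minimal sketch of what environment independence buys: the task "eventually reach the goal, never touch lava" is written over state labels rather than any particular environment's states, so only the labeling function changes across environments. The boolean checker below is a toy illustration, not the paper's specification language:

```python
# Check a temporal-logic-style spec (F goal AND G not-lava) against a
# trajectory, given a labeler that maps raw states to abstract labels.
def satisfies(trajectory, labeler):
    labels = [labeler(s) for s in trajectory]
    eventually_goal = any("goal" in l for l in labels)
    always_safe = all("lava" not in l for l in labels)
    return eventually_goal and always_safe

# The same spec transfers across environments; only the labeler changes.
labeler = lambda s: {"G": {"goal"}, "L": {"lava"}}.get(s, set())
print(satisfies(["S", "A", "G"], labeler))      # True: reached goal, no lava
print(satisfies(["S", "L", "G"], labeler))      # False: stepped in lava
```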
no code implementations • 12 Aug 2016 • Ashley Edwards, Charles Isbell, Atsuo Takanishi
Reinforcement learning problems are often described through rewards that indicate whether an agent has completed some task.
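A minimal sketch of such a completion reward, alongside a perceptual variant that scores progress by similarity to an image of the finished task; the goal image and distance measure are toy assumptions:

```python
import numpy as np

goal_image = np.zeros((8, 8))
goal_image[4, 4] = 1.0                                  # picture of "done"

def completion_reward(obs):
    return 1.0 if np.array_equal(obs, goal_image) else 0.0  # sparse: done or not

def perceptual_reward(obs):
    # Dense shaping: observations that look closer to the goal image
    # earn higher (less negative) reward.
    return -float(np.abs(obs - goal_image).mean())

obs = np.zeros((8, 8))
obs[4, 3] = 1.0                                         # one step from the goal
print(completion_reward(obs), perceptual_reward(obs))
```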