Learning to flexibly follow task instructions in dynamic environments poses significant challenges for reinforcement learning agents.
Although deep reinforcement learning has led to breakthroughs in many difficult domains, these successes have required ever-larger numbers of samples, leaving only a shrinking segment of the AI community with the resources to develop them.
Such representations can immediately be transferred between tasks that share the same types of objects, resulting in agents that require fewer samples to learn a model of a new task.
Online games are a massive entertainment market, and network latency is a key factor in a player's competitive edge.
In our efforts to model the rescuer's mind, we begin with a simple simulated search and rescue task in Minecraft with human participants.
For example, psychologists are less interested in a model that predicts human behavior with high accuracy than in identifying the differences between actions that lead to divergent human behavior.
A generally intelligent agent requires an open-scope world model: one rich enough to tackle any of the wide range of tasks it may be asked to solve over its operational lifetime.
When generating technical instructions, it is often convenient to describe complex objects in the world at different levels of abstraction.