Goal Randomization for Playing Text-based Games without a Reward Function

29 Sep 2021 · Meng Fang, Yunqiu Xu, Yali Du, Ling Chen, Chengqi Zhang

Playing text-based games requires language understanding and sequential decision making. The objective of a reinforcement learning agent is to behave so as to maximize the sum of a suitable scalar reward function. In contrast to current RL methods, humans are able to learn new skills with little or no reward by drawing on various forms of intrinsic motivation. We propose a goal randomization method that uses random basic goals to train a policy in the absence of environment rewards. Specifically, through simple but effective goal generation, our method learns to continuously propose challenging yet achievable goals that allow the agent to acquire general skills for acting in a new environment, independent of the task to be solved. Across a variety of text-based games, we show that this simple method yields competitive agent performance. We also show that our method can learn policies that generalize across different text-based games. Furthermore, we demonstrate an interesting result: on some text-based games, our method outperforms GATA, a state-of-the-art agent that uses environment rewards.
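The abstract does not come with code, so the following is a minimal sketch of the core idea only: sample random, simple goals from the game's observed entities and reward the agent intrinsically for achieving them, never touching the environment's score. Every name here (ToyTextEnv, sample_goal, the tabular preference update) is a hypothetical illustration under our own assumptions, not the paper's actual agent or goal generator.

import random

class ToyTextEnv:
    """Stub text environment: objects can be taken; no reward is ever returned."""
    def __init__(self):
        self.objects = ["key", "lamp", "coin"]

    def reset(self):
        self.inventory = set()
        return "You see a key, a lamp and a coin."

    def step(self, command):
        verb, obj = command.split()
        if verb == "take" and obj in self.objects:
            self.inventory.add(obj)
        # Observation only; the environment reward function is never used.
        return "You are carrying: " + ", ".join(sorted(self.inventory))

def sample_goal(objects):
    """Goal randomization: propose a random basic goal over observed entities."""
    return ("take", random.choice(objects))

def goal_achieved(goal, env):
    verb, obj = goal
    return obj in env.inventory

def train(episodes=50, horizon=5, eps=0.2, lr=0.1):
    env = ToyTextEnv()
    prefs = {}  # tabular goal-conditioned 'policy': score per (goal, command)
    for _ in range(episodes):
        env.reset()
        goal = sample_goal(env.objects)  # random goal each episode, no env reward
        for _ in range(horizon):
            commands = ["take " + o for o in env.objects]
            if random.random() < eps:
                cmd = random.choice(commands)
            else:
                cmd = max(commands, key=lambda c: prefs.get((goal, c), 0.0))
            env.step(cmd)
            # Intrinsic reward: 1 if the randomly sampled goal is satisfied.
            r = 1.0 if goal_achieved(goal, env) else 0.0
            old = prefs.get((goal, cmd), 0.0)
            prefs[(goal, cmd)] = old + lr * (r - old)
            if r:
                break
    return prefs

if __name__ == "__main__":
    prefs = train()
    print(sorted(prefs.items(), key=lambda kv: -kv[1])[:3])

In the paper the goal generator and the goal-conditioned policy operate over the game's raw text rather than a fixed command set, but the reward-free loop (sample a random basic goal, act, reward goal completion intrinsically) has the same shape as this sketch.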
