The goal of RLCard is to bridge reinforcement learning and imperfect-information games, and to push forward reinforcement learning research in domains with multiple agents, large state and action spaces, and sparse rewards.
Unlike other benchmarks such as the Arcade Learning Environment, evaluation of agent performance in Obstacle Tower is based on an agent's ability to perform well on unseen instances of the environment.
Combinations of Monte-Carlo tree search and deep neural networks, trained through self-play, have produced state-of-the-art results for automated game-playing in many board games.
The game of Go is more challenging than other board games, due to the difficulty of constructing a position or move evaluation function.
Crazyhouse has a higher branching factor than chess, and only a limited amount of lower-quality data is available compared with the data available for AlphaGo.
While reinforcement learning (RL) has been applied to turn-based board games for many years, more complex games involving decision-making in real-time are beginning to receive more attention.
We introduce the Fever Basketball game, a novel reinforcement learning environment where agents are trained to play basketball.
ExIt involves training a policy to mimic the search behaviour of a tree search algorithm - such as Monte-Carlo tree search - and using the trained policy to guide that search.
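The ExIt loop described above can be illustrated with a minimal sketch on a toy game. Everything here is an assumption made for illustration and does not come from the ExIt paper: the game is a small Nim variant (take 1 to 3 stones, whoever takes the last stone wins), the "expert" is a one-ply lookahead with deterministic rollouts rather than Monte-Carlo tree search, and the "apprentice" is a lookup table, so imitation amounts to copying the expert's move. The names `rollout`, `expert_move`, and `expert_iteration` are hypothetical.

```python
def legal_moves(n):
    """Moves available with n stones left: take 1, 2, or 3 (toy game, assumed)."""
    return [a for a in (1, 2, 3) if a <= n]


def rollout(n, policy):
    """Play out from pile size n with both players following `policy`.

    Returns +1 if the player to move at n takes the last stone, else -1.
    """
    mover = 0
    while True:
        n -= policy[n]
        if n == 0:
            return 1 if mover == 0 else -1
        mover ^= 1


def expert_move(n, policy):
    """Expert: one-ply lookahead whose leaves are scored by apprentice rollouts.

    This stands in for the tree search; it is guided by the current policy.
    """
    best_action, best_value = None, -2
    for a in legal_moves(n):
        rest = n - a
        # Taking the last stone wins; otherwise the opponent moves next,
        # so the rollout value is negated to get our own perspective.
        value = 1 if rest == 0 else -rollout(rest, policy)
        if value > best_value:
            best_action, best_value = a, value
    return best_action


def expert_iteration(max_pile=12, iterations=12):
    """Alternate between expert search and apprentice imitation."""
    # Apprentice starts with a naive policy: always take one stone.
    policy = {n: 1 for n in range(1, max_pile + 1)}
    for _ in range(iterations):
        # Expert step: search from every state, guided by the apprentice.
        targets = {n: expert_move(n, policy) for n in policy}
        # Apprentice step: imitate the expert (a table copy in this sketch).
        policy = targets
    return policy


policy = expert_iteration()
```

In this toy setting the winning piles are those not divisible by 4, and the move that leaves the opponent a multiple of 4 is the pile size modulo 4, so `policy[n]` can be checked against `n % 4` for those piles. A neural apprentice and a deeper search would replace the table and the one-ply lookahead in a realistic setup.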