no code implementations • 7 Dec 2021 • Xingxing Liang, Yang Ma, Yanghe Feng, Zhong Liu
In addition, by analyzing the heatmap of priority changes at various locations in the priority memory during training, we find that memory size and rollout length can have a significant impact on the distribution of trajectory priorities and, hence, on the performance of the algorithm.
no code implementations • 24 Dec 2018 • Xingxing Liang, Qi Wang, Yanghe Feng, Zhong Liu, Jincai Huang
Recent breakthroughs in Go and other strategy games have demonstrated the great potential of reinforcement learning for intelligent scheduling in uncertain environments, but bottlenecks also arise when this paradigm is generalized to universal complex tasks.