Reinforcement Learning for Volt-Var Control: A Novel Two-stage Progressive Training Strategy

23 Nov 2021  ·  Si Zhang, Mingzhi Zhang, Rongxing Hu, David Lubkeman, Yunan Liu, Ning Lu

This paper develops a reinforcement learning (RL) approach to solve a cooperative, multi-agent Volt-Var Control (VVC) problem for high-solar-penetration distribution systems. The novelty of our RL method lies in a two-stage progressive training strategy that effectively improves the training speed and convergence of the learning algorithm. In Stage 1 (individual training), each agent is trained separately, with all other agents held inactive, to obtain its own optimal VVC actions from the action space {consume, generate, do-nothing}. In Stage 2 (cooperative training), all agents are trained again jointly so that they share the VVC responsibility. Rewards and costs in our RL scheme include (i) a system-level reward (for taking an action), (ii) an agent-level reward (for doing nothing), and (iii) an agent-level action cost function. This framework allows rewards to be allocated dynamically to each agent according to its contribution while accounting for the trade-off between control effectiveness and action cost. The proposed methodology is tested and validated on a modified IEEE 123-bus system using realistic PV and load profiles. Simulation results confirm that the proposed approach is robust and computationally efficient, and that it achieves desirable Volt-Var control performance under a wide range of operating conditions.
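The sketch below illustrates the overall shape of the two-stage progressive training loop described in the abstract: Stage 1 trains each agent alone while the others remain inactive, and Stage 2 trains all agents together with a reward built from a system-level term, a do-nothing term, and an action cost. All names here (VVCEnvStub, DQNAgent, agent_reward) and the exact way the reward terms are combined are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of the two-stage progressive training strategy, assuming
# hypothetical interfaces; it is not the paper's code.
import random

ACTIONS = ["consume", "generate", "do-nothing"]

class DQNAgent:
    """Placeholder agent with an epsilon-greedy policy and a learn() hook."""
    def __init__(self, agent_id):
        self.agent_id = agent_id

    def act(self, obs, epsilon=0.1):
        # Stand-in for an epsilon-greedy choice over learned Q-values.
        return random.choice(ACTIONS) if random.random() < epsilon else "do-nothing"

    def learn(self, obs, action, reward, next_obs):
        pass  # stand-in for a Q-learning / DQN update

class VVCEnvStub:
    """Toy stand-in for the modified IEEE 123-bus VVC environment."""
    def __init__(self, agent_ids, horizon=24):
        self.agent_ids, self.horizon = agent_ids, horizon
        self.action_cost = {"consume": 0.1, "generate": 0.1, "do-nothing": 0.0}
        self.nothing_reward = 0.05

    def reset(self):
        self.t = 0
        return {i: 0.0 for i in self.agent_ids}

    def step(self, actions):
        self.t += 1
        # Illustrative system-level reward: acting improves voltage at some effort.
        sys_r = 1.0 - 0.1 * sum(a != "do-nothing" for a in actions.values())
        return {i: float(self.t) for i in self.agent_ids}, sys_r, self.t >= self.horizon

def agent_reward(sys_r, action, env):
    # Reward components named in the abstract: a system-level reward for acting,
    # an agent-level reward for doing nothing, and an agent-level action cost.
    # How they are combined here is an assumption for illustration only.
    if action == "do-nothing":
        return env.nothing_reward
    return sys_r - env.action_cost[action]

def stage1_individual(env, agents, episodes=100):
    # Stage 1: train each agent separately while all other agents stay inactive.
    for agent in agents:
        for _ in range(episodes):
            obs, done = env.reset(), False
            while not done:
                actions = {a.agent_id: "do-nothing" for a in agents}
                actions[agent.agent_id] = agent.act(obs[agent.agent_id])
                next_obs, sys_r, done = env.step(actions)
                agent.learn(obs[agent.agent_id], actions[agent.agent_id],
                            agent_reward(sys_r, actions[agent.agent_id], env),
                            next_obs[agent.agent_id])
                obs = next_obs

def stage2_cooperative(env, agents, episodes=100):
    # Stage 2: all agents act and learn together, sharing VVC responsibility.
    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            actions = {a.agent_id: a.act(obs[a.agent_id]) for a in agents}
            next_obs, sys_r, done = env.step(actions)
            for a in agents:
                a.learn(obs[a.agent_id], actions[a.agent_id],
                        agent_reward(sys_r, actions[a.agent_id], env),
                        next_obs[a.agent_id])
            obs = next_obs

if __name__ == "__main__":
    agents = [DQNAgent(i) for i in range(3)]
    env = VVCEnvStub([a.agent_id for a in agents])
    stage1_individual(env, agents, episodes=5)   # individual pre-training
    stage2_cooperative(env, agents, episodes=5)  # cooperative fine-tuning
```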
