Safe deep reinforcement learning-based constrained optimal control scheme for active distribution networks

Reinforcement learning-based schemes have recently been applied to model-free voltage control in active distribution networks. However, existing reinforcement learning methods struggle with problems that have continuous state and action spaces or operational constraints. To address these limitations, this paper proposes an optimal voltage control scheme based on safe deep reinforcement learning. In this scheme, the optimal voltage control problem is formulated as a constrained Markov decision process in which both the state and action spaces are continuous. To solve this problem efficiently, the deep deterministic policy gradient algorithm is used to learn reactive power control policies that map states to optimal control actions. In contrast to existing reinforcement learning methods, deep deterministic policy gradient naturally handles control problems with continuous state and action spaces because it uses deep neural networks to approximate both the value function and the policy. In addition, to handle the operation constraints of active distribution networks, a safe exploration approach is proposed that forms a safety layer composed directly on top of the deep deterministic policy gradient actor network. This safety layer predicts the change in the constrained states and prevents violations of the active distribution network's operation constraints. Numerical simulations on modified IEEE test systems demonstrate that the proposed scheme keeps all bus voltages within the allowed range and reduces system losses by 15% compared with the no-control case.
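As a rough illustration of the safety-layer idea described in the abstract (not the authors' implementation), the sketch below assumes a linear sensitivity model of the constrained bus voltages with respect to the reactive power action, and corrects the actor's proposed action with the standard closed-form single-constraint projection. All function names, array shapes, and numbers are hypothetical.

```python
import numpy as np


def safety_layer(action, c, g, c_max):
    """Project `action` so every predicted constraint value stays within `c_max`.

    action : (m,)   actor's proposed reactive power action
    c      : (k,)   current constraint values (e.g. bus-voltage deviations)
    g      : (k, m) learned sensitivities of each constraint w.r.t. the action
    c_max  : (k,)   constraint upper bounds
    """
    # Predicted violation margin for each constraint after applying `action`,
    # using the assumed linear model c_i(s, a) ~ c_i(s) + g_i(s)^T a.
    margins = c + g @ action - c_max
    # Only constraints predicted to be violated (positive margin) trigger a
    # correction; the divisor is the squared norm of that constraint's gradient.
    lam = np.maximum(margins / (np.sum(g * g, axis=1) + 1e-8), 0.0)
    # Correct along the most-violated constraint's gradient (single-constraint
    # closed form, as in typical safety-layer formulations).
    worst = np.argmax(lam)
    return action - lam[worst] * g[worst]


# Hypothetical example: one reactive power action, two voltage constraints.
a = np.array([0.6])
c = np.array([0.02, -0.01])       # current deviations from nominal voltage (p.u.)
g = np.array([[0.08], [0.03]])    # learned voltage sensitivities
print(safety_layer(a, c, g, c_max=np.array([0.05, 0.05])))  # corrected action
```

In this toy example the corrected action is the smallest adjustment that brings the predicted voltage deviation of the binding bus back to its limit, which is the role the safety layer plays on top of the actor network.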
