A Deterministic Policy Gradient Based Load Control Policy in Direct Current Distribution Networks
Abstract— Algorithms that seek the global optimum of non-convex optimization problems have considerable
practical potential. Previous research in this field either converges to a local optimum or loses
accuracy through convex relaxation. In this paper, we consider a demand side management (DSM)
problem in direct current (DC) distribution networks as an application for studying global optimum seeking in
non-convex optimization. Because of the voltage and network constraints, non-convexity appears in the objective
function, which accounts for the tradeoff between operation costs and users' preferences. Exploiting the freedom
to express a learning problem as a non-convex optimization, we explore a deterministic policy gradient (DPG)
based algorithm to calculate the global optimum. A policy network and a polynomial regression critic are
built to learn the optimal policy under exploration noise. Numerical results demonstrate that
the DPG algorithm increases the probability of convergence to the global optimum.
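
The actor-critic loop summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the DC network model, the constraints, and the policy network are replaced by assumptions chosen for brevity. The cost function `cost`, the single policy parameter `theta` (standing in for a policy network), the first-degree polynomial critic, and all hyperparameters are hypothetical. The sketch only shows the mechanics: perturb the deterministic action with exploration noise, fit a polynomial critic to the observed rewards by least squares, and move the policy along the critic's gradient with respect to the action.

```python
import random


def cost(a):
    """Hypothetical non-convex cost with one local and one global minimum."""
    return a**4 - 3.0 * a**2 + a


def fit_linear_critic(actions, rewards):
    """Least-squares fit of reward ~ c0 + c1 * action; returns (c0, c1)."""
    n = len(actions)
    mean_a = sum(actions) / n
    mean_r = sum(rewards) / n
    var_a = sum((a - mean_a) ** 2 for a in actions)
    cov_ar = sum((a - mean_a) * (r - mean_r) for a, r in zip(actions, rewards))
    c1 = cov_ar / var_a
    c0 = mean_r - c1 * mean_a
    return c0, c1


def train(theta=0.0, sigma=0.3, lr=0.02, batch=32, iters=300, seed=0):
    """Stateless DPG-style loop; every default value is an illustrative assumption."""
    rng = random.Random(seed)
    for _ in range(iters):
        # Explore around the deterministic action theta with Gaussian noise.
        actions = [theta + rng.gauss(0.0, sigma) for _ in range(batch)]
        rewards = [-cost(a) for a in actions]
        # Critic: first-degree polynomial regression of reward on action.
        _, slope = fit_linear_critic(actions, rewards)
        # Actor: step along the critic's gradient with respect to the action.
        theta += lr * slope
    return theta


theta_final = train()
print(theta_final, cost(theta_final))
```

In this toy setting the fitted slope approximates the reward gradient averaged over the exploration noise, so the policy follows a smoothed version of the objective rather than the raw one. Whether this helps avoid poor local optima depends on the noise scale and the starting point, and the sketch makes no claim about the paper's network model.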
Index Terms— Distribution networks, Demand-side management (DSM), Deterministic policy gradient, Reinforcement learning
Hong Duan, Xu Zhou, Xianhong Kang, Zhongjing Ma
School of Automation, Beijing Institute of Technology, CHINA
Shanxi Information Industry Technology Research Institute Co., CHINA
Cite: Hong Duan, Xu Zhou, Xianhong Kang, Zhongjing Ma, "A Deterministic Policy Gradient Based Load Control Policy in Direct Current Distribution Networks," Proceedings of 2019 the 9th International Workshop on Computer Science and Engineering, pp. 996-1001, Hong Kong, 15-17 June, 2019.