
Reinforcement Learning Algorithm Based CPS Order Dynamic Optimal Dispatch Methodology For Interconnected Power Systems

Posted on: 2011-04-02    Degree: Master    Type: Thesis
Country: China    Candidate: Y M Wang    Full Text: PDF
GTID: 2132360308464183    Subject: Power system and its automation
Abstract/Summary:
The dynamic optimization of automatic generation control (AGC) generating-command dispatch is a stochastic optimization problem for interconnected power systems. In 1997 the North American Electric Reliability Council (NERC) formally released new Control Performance Standards (CPS) for AGC of interconnected power systems operating under Tie-line Bias Control (TBC) mode. The CPS standards pay more attention to the medium- and long-term returns of AGC performance and change the traditional AGC control philosophy; how to design a CPS order dynamic optimal dispatch strategy under the CPS standards has therefore become a brand-new topic of theoretical research.

Firstly, the paper briefly states the principle of CPS order dynamic optimal dispatch, introduces the background and current research status of CPS order dispatch at home and abroad, and gives a mathematical analysis of NARI's CPS control rules. Based on an in-depth study of the characteristics of CPS order dispatch and its optimal control objective, the paper argues that a NERC CPS based AGC system is a stochastic multistage decision process, so the dispatch problem is suitably modeled as a reinforcement learning (RL) problem under Discrete Time Markov Decision Process (DTMDP) theory, and Q-learning based optimal stochastic control techniques are introduced into the domain of CPS order dispatch to solve it.

Secondly, using Matlab/Simulink and DTMDP simulation modeling, load frequency control (LFC) models of a two-area power system and of the Guangdong power grid are taken as examples for a detailed comparison and analysis of three Q-learning based CPS order dispatch algorithms. The reward functions in Q-learning are designed according to different optimization objectives. Thermal and hydro units are dispatched jointly, with the regulating margin of the hydro units taken into account, to improve the regulating performance of the AGC system.
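The DTMDP formulation above can be sketched as a tabular Q-learning loop. This is only an illustrative outline, not the thesis's implementation: the state set (here, discretized CPS-compliance bands), the action set (coarse regulating-power commands), and the scalar reward are all assumptions standing in for the thesis's objective-specific reward designs.

```python
import random

# Hedged sketch of CPS order dispatch as a Discrete Time Markov
# Decision Process solved by tabular Q-learning.
# STATES and ACTIONS are illustrative placeholders (assumed), not the
# thesis's actual state/action design.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
STATES = range(5)    # e.g. discretized CPS1-compliance bands (assumed)
ACTIONS = range(3)   # e.g. coarse AGC regulating-power commands (assumed)

# Q-table over all state-action pairs, initialized to zero.
Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}

def choose_action(s):
    """Epsilon-greedy policy: explore with probability EPSILON."""
    if random.random() < EPSILON:
        return random.choice(list(ACTIONS))
    return max(ACTIONS, key=lambda a: Q[(s, a)])

def q_update(s, a, r, s_next):
    """One-step Q-learning backup: Q += alpha * TD-error."""
    best_next = max(Q[(s_next, a2)] for a2 in ACTIONS)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
```

In use, the reward `r` would be computed from the CPS assessment of the resulting control interval, which is what gives the method its online self-learning character.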
The multi-step Q(λ) method with a backtracking function is also employed to overcome the long control time delay in the AGC control loop. Statistical experiment results show that the proposed dispatch methodology, with its online self-learning technique and dynamic optimization capability, markedly enhances the robustness and adaptability of AGC systems while CPS compliance is ensured.

Finally, the paper presents an improved hierarchical reinforcement learning (HRL) algorithm to overcome the curse of dimensionality in the multi-objective dynamic optimization of CPS order dispatch. The dispatch task is decomposed into several subtasks by classifying the committed AGC units according to the response time delay of their power regulation. A time-varying coordination factor introduced between the HRL layers speeds up the algorithm by 60%. A set of linear weight combinations in the reward function is designed to jointly optimize the hydro capacity margin and the AGC production cost. Application of the improved hierarchical Q-learning to the China Southern Power Grid model shows that the proposed method enhances the CPS-assessment performance of AGC systems and reduces AGC regulation cost by over 4%, compared with standard hierarchical Q-learning and a genetic algorithm.

This work was supported by the National Natural Science Foundation of China project "AGC Optimal Relaxed Control and its Markov Decision Process based on Control Performance Standards" (50807016), Guangdong Natural Science Foundation project (9151064101000049), and the Fundamental Research Funds for the Central Universities (No. 2009ZM0251).
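The backtracking behavior of multi-step Q(λ) can be sketched with eligibility traces: a delayed reward (such as one arriving after the long AGC control-loop delay) is backed up to the whole recent state-action history in a single update. The sketch below uses Watkins-style trace cutting; the state/action sets and parameters are illustrative assumptions, not the thesis's configuration.

```python
# Hedged sketch of Watkins' Q(lambda): eligibility traces let one TD
# error update every recently visited state-action pair, which is the
# "backtracking" that copes with delayed rewards in the AGC loop.
# STATES, ACTIONS, and all constants are illustrative assumptions.
ALPHA, GAMMA, LAMBDA = 0.1, 0.9, 0.8
STATES, ACTIONS = range(4), range(2)

Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
e = {(s, a): 0.0 for s in STATES for a in ACTIONS}  # eligibility traces

def greedy(s):
    """Greedy action under the current Q-table."""
    return max(ACTIONS, key=lambda a: Q[(s, a)])

def q_lambda_update(s, a, r, s_next, a_next):
    """One Q(lambda) step for the transition (s, a, r, s_next)."""
    a_star = greedy(s_next)
    delta = r + GAMMA * Q[(s_next, a_star)] - Q[(s, a)]
    e[(s, a)] += 1.0  # accumulating trace on the visited pair
    for key in Q:
        Q[key] += ALPHA * delta * e[key]
        # Watkins' rule: decay traces after a greedy action, cut them
        # to zero after an exploratory one.
        e[key] = GAMMA * LAMBDA * e[key] if a_next == a_star else 0.0
```

Because the trace on each earlier pair decays by γλ per step, a reward arriving several control intervals late still credits the dispatch decisions that caused it, which is why the multi-step method suits the delayed AGC loop better than one-step Q-learning.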
Keywords/Search Tags:Automatic generation control, Control performance standard, Reinforcement learning, Markov Decision Process, Q-learning, Stochastic optimization