
The Stochastic Optimum Control Of The Interconnected Power Grid For AGC Based On On-policy Reinforcement Learning

Posted on: 2014-08-21
Degree: Master
Type: Thesis
Country: China
Candidate: S P Zhang
Full Text: PDF
GTID: 2252330401458888
Subject: Power system and its automation
Abstract/Summary:
After NERC released the CPS standards for Automatic Generation Control (AGC), the study of stochastic optimal control strategies for AGC became a hot topic. Researchers have put forward many control strategies, such as the traditional PI control strategy, self-adaptive control strategies, and intelligent control strategies. The classic PI control strategy is not good enough to meet the instantaneity, adaptability, and robustness requirements of modern control systems. Reinforcement learning is an important artificial intelligence control method, and it can be divided into two kinds according to the policy-update principle: on-policy and off-policy. A CPS controller based on an off-policy reinforcement learning method such as Q-learning updates its value functions with hypothetical actions, which may leave the system in a deviated state for a long while when strong disturbances are encountered. This paper presents a stochastic optimal automatic generation control algorithm based on SARSA or SARSA(λ) learning under CPS compliance.

Firstly, the AGC controller based on SARSA-learning is designed and the relevant programs are written. The most important components of the controller are the environmental state S, the reward function R, the probability function P, the actions A, and the value function Q. The method seeks the optimal policy by trial and error. CPS1 and instantaneous ACE values are used to construct the reward function, which is updated based on historical experience. Simulation results are presented for two different reward functions.
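As a rough illustration of the on-policy update described above, a minimal tabular SARSA sketch is given below. The state labels (discretized ACE levels), action set, and parameter values are illustrative assumptions for this abstract, not the values used in the thesis.

```python
import random

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """One on-policy SARSA update: the TD target uses the action a_next
    actually chosen by the current policy, not a hypothetical greedy max
    as in off-policy Q-learning."""
    td_target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (td_target - Q[(s, a)])
    return Q

# Toy example: states are coarse ACE bands, actions are generation
# adjustment commands in MW (both hypothetical labels).
states = ["ACE_neg", "ACE_zero", "ACE_pos"]
actions = [-50, 0, +50]
Q = {(s, a): 0.0 for s in states for a in actions}

def epsilon_greedy(Q, s, epsilon=0.1):
    """Behavior policy used for both acting and updating (on-policy)."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])
```

Because the same epsilon-greedy policy generates the actions used in the update target, the learned value function reflects the exploratory behavior actually applied to the grid.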
Secondly, the multi-step backtracking SARSA(λ)-learning algorithm is implemented and the relevant AGC controller is designed. Eligibility traces are introduced to reduce the delay time, as there are more thermal plants than hydro plants in the control system. The simulation results show that this method can effectively avoid searching some dangerous perturbation states and achieves better control performance. In order to make the reinforcement learning algorithm more convenient to apply in real power systems, the pre-learning process of the SARSA(λ) controller is replaced by an imitation-learning process.

At last, continuous outputs for the SARSA-learning controller are realized by output function approximation. This means the power control instructions are more suitable for real control, and the "curse of dimensionality" is avoided as well.

This work is jointly supported by the National High-tech Research and Development Program (863) (2012AA050209), the National Natural Science Foundation of China (51177051), and the Fundamental Research Funds for the Central Universities (2012ZZ0020).
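The eligibility-trace mechanism mentioned above can be sketched as a single SARSA(λ) step: the TD error is propagated back to all recently visited state-action pairs in proportion to their decaying eligibility, which helps with the long response delays of thermal units. The accumulating-trace variant and all parameter values here are illustrative assumptions, not the thesis's exact configuration.

```python
def sarsa_lambda_update(Q, E, s, a, r, s_next, a_next,
                        alpha=0.1, gamma=0.9, lam=0.8):
    """One SARSA(lambda) step with accumulating eligibility traces.

    Q : dict mapping (state, action) -> value
    E : dict mapping (state, action) -> eligibility
    """
    # TD error for the transition actually taken (on-policy target).
    delta = r + gamma * Q[(s_next, a_next)] - Q[(s, a)]
    # Bump the eligibility of the visited pair (accumulating trace).
    E[(s, a)] = E.get((s, a), 0.0) + 1.0
    # Spread the TD error over all eligible pairs, then decay traces.
    for key in list(E):
        Q[key] = Q.get(key, 0.0) + alpha * delta * E[key]
        E[key] *= gamma * lam
    return Q, E
```

Setting lam = 0 recovers one-step SARSA; larger lam assigns credit further back along the trajectory, shortening the effective learning delay.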
Keywords/Search Tags:Automatic generation control (AGC), Control performance standard (CPS), on-policy, SARSA-learning, SARSA(λ)-learning, function approximation