| Owing to the substantial benefits of the inter-regional power grid in optimizing resource allocation and promoting the consumption of renewable energy, the scale of the power grid has expanded continuously and its operation has become increasingly complex. Traditional dispatching technology based on experience and analysis cannot meet the development needs of the inter-regional power grid, so it is necessary to study dispatching methods that can provide reliable and suitable policies for the dispatching optimization of inter-regional power grid (DOIRPG). How to deal with the uncertainties caused by the large-scale integration of renewable energy, and how to tap the potential benefits brought by the dispatching of flexible load, have become important topics in the study of automatic dispatching technology. Hence, this thesis mainly carries out the following work. Firstly, the architecture of the inter-regional power grid and the characteristics of its constituent parts are introduced. The architecture includes wind generating units, photovoltaic generating units, thermal generating units, rigid load demand, flexible load demand, and direct current tie-lines. Uncertainty models of wind generation, photovoltaic generation, and load demand are each established based on probability distribution functions, and interruption and rebound models of flexible load are established according to the full-time control and three-stage rebound strategy. In addition, constraint models of the thermal generating units and tie-lines are established based on the actual operating constraints of the inter-regional power grid. Secondly, the discrete-time Markov decision process (DTMDP) model for the DOIRPG problem is established and a learning optimization algorithm is given. A discretization method for the continuous variables in the DOIRPG problem is presented, and the DTMDP model of the DOIRPG problem is established based
on the following aspects: system state and state space, action and action sets, transition probability and transition process, and reward function and optimization goal. The theory and specific steps of multi-agent hierarchical Q-learning for solving the DOIRPG problem are given. In addition, this thesis analyzes the task characteristics of DOIRPG, proposes a method to measure the similarity between DOIRPG tasks, and gives the theory of knowledge transfer and the specific process of transfer learning. Finally, specific knowledge transfer methods for the DOIRPG problem are designed and simulations are conducted. A single-source knowledge transfer method is designed first. The experimental results verify the effectiveness of the single-source transfer method and the rationality of the similarity measurement method, and the effect of different weight parameters on optimization speed is analyzed. Subsequently, several multi-source knowledge transfer methods are designed for the DOIRPG problem. Their effectiveness is verified, and the effect of the different methods on optimization speed is also analyzed. |
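
The three ingredients named in the abstract (discretization of continuous variables, a tabular Q-learning update, and single-source knowledge transfer) can be illustrated with a minimal sketch. It is not the thesis's actual implementation: the function names, the equal-width binning, the scalar `similarity` weight, and the default `alpha` and `gamma` values are all assumptions chosen for illustration, and the multi-agent hierarchical structure is omitted.

```python
def discretize(value, low, high, n_bins):
    """Map a continuous value in [low, high] to an equal-width bin index (assumed scheme)."""
    if value <= low:
        return 0
    if value >= high:
        return n_bins - 1
    width = (high - low) / n_bins
    return int((value - low) / width)


def transfer_q_table(source_q, similarity):
    """Initialise a target Q-table as a similarity-weighted copy of a source Q-table.

    `similarity` is a hypothetical scalar in [0, 1]; how it is measured is
    not specified here.
    """
    return {key: similarity * q for key, q in source_q.items()}


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One standard tabular Q-learning step on a dict keyed by (state, action)."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]
```

In this sketch the target task starts from scaled source values rather than from zeros, which is one simple way a weight parameter could influence how quickly learning converges.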