
Policy Iteration Algorithm And Optimization Design For A Class Of Markov Jump Systems

Posted on: 2020-12-09
Degree: Master
Type: Thesis
Country: China
Candidate: M G Zhang
GTID: 2428330575963139
Subject: Engineering
Abstract/Summary:
This thesis studies adaptive optimal control algorithms for a class of Markov jump systems. A Markov jump system is a hybrid dynamic system whose internal subsystems switch according to a prescribed jump rule. It contains two dynamic mechanisms, the state and the mode, and the transition rates couple the subsystems to one another. To remove this coupling, the thesis introduces a subsystem decomposition technique and builds the adaptive optimal control algorithms for Markov jump systems on top of it.

Reinforcement learning, as an intelligent learning method, can solve optimal control problems for linear/nonlinear systems online and has produced many good research results. For jump systems, however, most existing adaptive optimal control results are offline methods. The online solutions that do exist still depend on the system model; that is, the internal dynamic matrices of the system must be known in advance. The thesis therefore proposes a new reinforcement learning algorithm that combines online learning with an iterative scheme to find the optimal controller satisfying a given performance index for a Markov jump system. The specific research contents are as follows:

(1) The optimal control problem for a class of continuous-time Markov jump systems is studied. Based on the subsystem decomposition technique, an online reinforcement learning algorithm is proposed. First, the Bellman optimality principle is used to transform the optimal control problem of the jump system into a set of coupled Riccati equations. After an equivalent transformation of these Riccati equations, the adaptive optimal controller of the jump system is obtained online by the reinforcement learning algorithm. The solution process does not require the internal dynamic matrices of the system to be known in advance; that is, the algorithm is independent of the system model. A convergence proof of the algorithm is then given, and simulation results verify its correctness and applicability.

(2) The two-player zero-sum control problem for a class of continuous-time Markov jump systems is studied. Using the proposed reinforcement learning algorithm, the controller that satisfies the zero-sum performance index is solved, and the convergence and feasibility of the algorithm are proved.

(3) The nonzero-sum differential feedback Nash control problem for a class of continuous-time Markov jump systems is studied. The coupled Riccati equations of the two-input system are simplified by an equivalent transformation. Combining the subsystem decomposition technique with an offline policy iteration algorithm, the nonzero-sum differential feedback Nash controllers of the Markov jump system are solved. Each iteration of the algorithm consists of two steps: policy evaluation and policy update. Numerical simulation also validates the effectiveness of the proposed algorithm.

Finally, the optimal control algorithms designed in the thesis are summarized, and open problems and directions for future research are pointed out.
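To make the coupled Riccati equations and the two-step policy iteration concrete, the following is a minimal sketch under stated assumptions. It is not the thesis's algorithm: it is an offline, model-based, Kleinman-style iteration in which the matrices A_i, B_i and the transition-rate matrix Pi are assumed known. The decoupling used here (absorbing half of the diagonal rate pi_ii into each mode's drift matrix and taking the other modes' cost matrices from the previous iteration) is one common choice and is only assumed to resemble the subsystem decomposition mentioned in the abstract. All function names and the two-mode example data are hypothetical.

```python
import numpy as np


def solve_lyapunov(Ac, M):
    """Solve Ac.T @ P + P @ Ac + M = 0 for P by Kronecker vectorisation."""
    n = Ac.shape[0]
    I = np.eye(n)
    L = np.kron(Ac.T, I) + np.kron(I, Ac.T)
    P = np.linalg.solve(L, -M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrise against round-off


def care_residual(A, B, Q, R, Pi, P, i):
    """Residual of the i-th coupled Riccati equation:
    A_i'P_i + P_i A_i + Q_i - P_i B_i R_i^{-1} B_i' P_i + sum_j pi_ij P_j."""
    coupling = sum(Pi[i, j] * P[j] for j in range(len(P)))
    G = B[i] @ np.linalg.solve(R[i], B[i].T)
    return A[i].T @ P[i] + P[i] @ A[i] + Q[i] - P[i] @ G @ P[i] + coupling


def policy_iteration(A, B, Q, R, Pi, K0, iters=500, tol=1e-10):
    """Offline policy iteration for a continuous-time Markov jump LQ problem."""
    N, n = len(A), A[0].shape[0]
    K = [k.copy() for k in K0]
    P = [np.zeros((n, n)) for _ in range(N)]
    for _ in range(iters):
        P_new = []
        for i in range(N):
            # Decoupling step: absorb half of pi_ii into the drift matrix.
            A_bar = A[i] + 0.5 * Pi[i, i] * np.eye(n)
            Ac = A_bar - B[i] @ K[i]
            # Coupling from the other modes, frozen at the previous iterate.
            coupling = sum(Pi[i, j] * P[j] for j in range(N) if j != i)
            # Policy evaluation: one Lyapunov equation per mode.
            M = Q[i] + K[i].T @ R[i] @ K[i] + coupling
            P_new.append(solve_lyapunov(Ac, M))
        # Policy update: K_i = R_i^{-1} B_i' P_i.
        K = [np.linalg.solve(R[i], B[i].T @ P_new[i]) for i in range(N)]
        delta = max(np.max(np.abs(P_new[i] - P[i])) for i in range(N))
        P = P_new
        if delta < tol:
            break
    return P, K


# Hypothetical two-mode example; each row of Pi sums to zero.
A = [np.array([[-1.0, 1.0], [0.0, -2.0]]), np.array([[-2.0, 0.0], [1.0, -1.0]])]
B = [np.array([[0.0], [1.0]]), np.array([[1.0], [0.0]])]
Q = [np.eye(2), np.eye(2)]
R = [np.eye(1), np.eye(1)]
Pi = np.array([[-1.0, 1.0], [2.0, -2.0]])
K0 = [np.zeros((1, 2)), np.zeros((1, 2))]  # zero gain; both modes are stable

P, K = policy_iteration(A, B, Q, R, Pi, K0)
worst = max(np.max(np.abs(care_residual(A, B, Q, R, Pi, P, i))) for i in range(2))
print("largest Riccati residual:", worst)
```

The zero initial gain is admissible here only because the example's open-loop modes were chosen to be stable; in general a stabilizing initial gain has to be supplied. An online, model-free variant of the kind the abstract describes would replace the Lyapunov solve in the policy evaluation step with an estimate built from measured state and input data.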
Keywords/Search Tags: Markov jump system, reinforcement learning algorithm, adaptive optimization control, Riccati equation, nonzero-sum differential feedback