| With the development of industry and technology, the modern nuclear power industry has entered the era of intelligence. Innovation in nuclear power technology demands greater efficiency and intelligence, and the requirements on control performance are becoming stricter. However, complex nuclear power systems usually exhibit high intrinsic nonlinearity and strong multivariable coupling. An accurate mathematical model is difficult to establish, so traditional adaptive control theory struggles to achieve satisfactory control quality. Adaptive dynamic programming (ADP), a self-learning optimization algorithm that combines neural networks (NNs), the actor-critic algorithm, and dynamic programming, has received extensive attention since it was proposed. It effectively overcomes the "curse of dimensionality" in dynamic programming and is well suited to solving for the optimal control strategy of nonlinear systems. It also improves control accuracy and greatly reduces control cost. Based on the ADP algorithm and a nonlinear system model of a 2500 MW advanced pressurized water reactor (PWR) nuclear power plant, this dissertation designs a tracking control method for the 2500 MW PWR reactor power. The main research work is summarized as follows. First, for a multi-coupled nonlinear discrete-time nuclear reactor system, a quadratic index function is introduced, an off-line tracking control algorithm based on value iteration is given, and a multi-set-point tracking controller is designed to improve control quality. The tracking control scheme is then improved with an actor-critic network architecture, and the influence of key parameter values on the tracking performance is analyzed. The simulation results are compared with model predictive control (MPC), which also shows that the algorithm has a strong ability to solve the optimal tracking control problem when the model is unknown. Second, to obtain better control accuracy and to deal with the problem of intrinsic jump parameters, an integral reinforcement learning method based on policy iteration is designed. Starting from an initial admissible control law, the optimal tracking control law is obtained through successive online iterative optimization of the error control law, which realizes tracking control of a nonlinear continuous-time nuclear power system with higher control accuracy. A comparison with the traditional PID control strategy verifies that the proposed algorithm achieves good experimental results. Third, building on multi-objective tracking control, the nuclear power system is analyzed to obtain typical operating conditions, and an ADP-based tracking control method is designed that switches the output among multiple operating conditions in real time while supervising the stability and safety of the system state, so as to satisfy the requirements for nuclear power systems participating in power-grid peak shaving. |
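
As a rough illustration of the value-iteration idea with a quadratic index function mentioned above, the following Python sketch may help. It is not the dissertation's algorithm or reactor model: it assumes a hypothetical linear error system e[k+1] = A e[k] + B u[k] with cost sum(e'Qe + u'Ru), for which the value function is exactly quadratic, V_i(e) = e' P_i e, and value iteration reduces to a Riccati recursion started from P_0 = 0. In the nonlinear, model-free setting described in the abstract, neural networks would approximate the value function and the control law instead. The function name `value_iteration_lq` and all matrices are assumptions chosen for the example.

```python
import numpy as np


def value_iteration_lq(A, B, Q, R, iters=2000, tol=1e-10):
    """Value iteration for a quadratic value function V_i(e) = e' P_i e.

    Assumes (hypothetically) linear tracking-error dynamics
    e[k+1] = A e[k] + B u[k] and stage cost e'Qe + u'Ru.
    Starts from P_0 = 0, as is usual for value iteration.
    """
    n = A.shape[0]
    P = np.zeros((n, n))
    for _ in range(iters):
        # Policy improvement: u = -K e minimizes stage cost + V_i(e[k+1]).
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Value update: V_{i+1}(e) = e'Qe + u'Ru + V_i(A e + B u) with u = -K e,
        # which simplifies to the Riccati recursion below.
        P_next = Q + A.T @ P @ (A - B @ K)
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    # Gain that is greedy with respect to the final value function.
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K


if __name__ == "__main__":
    # Toy two-state system (illustrative values only, not a reactor model).
    A = np.array([[1.0, 0.1], [0.0, 0.9]])
    B = np.array([[0.0], [0.1]])
    Q = np.eye(2)
    R = np.array([[1.0]])

    P, K = value_iteration_lq(A, B, Q, R)

    # Drive an initial tracking error toward zero with u = -K e.
    e = np.array([[1.0], [0.0]])
    for _ in range(200):
        e = (A - B @ K) @ e
    print("feedback gain K:", K)
    print("tracking error norm after 200 steps:", float(np.linalg.norm(e)))
```

The iteration alternates a greedy policy step with a value update, which is the same structure an actor-critic implementation follows when the actor and critic are function approximators rather than the closed-form gain and matrix used here.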