| Chemical process control methods are essential for the stable operation of chemical processes and for maximizing economic benefits. With the increasing scale of chemical production, the growing complexity of production methods, and the diversification of chemical products, traditional process control methods face challenges in modeling, optimization, and computational performance. Intelligent control based on artificial intelligence is believed to be an effective solution to these problems, so developing intelligent control technology for chemical processes is of great significance. Reinforcement learning (RL) is an artificial intelligence technology based on the principle of optimization. It learns through interaction in order to achieve optimal control and decision-making for complex systems. Recently, with the development of artificial intelligence and parallel computing technology, RL combined with deep neural networks has defeated top human experts in the game of Go and reached human-level control performance in some video games, and it has therefore attracted widespread attention. Researchers have gradually extended the research and application of RL to process control, and believe that RL will have a significant impact on the field of chemical process control or, more generally, process operations. However, applying existing RL algorithms to chemical processes remains challenging. This thesis focuses on three issues: first, the low sampling (training) efficiency of RL, which makes the resource cost unacceptable and online application impossible; second, the difficulty of dealing with the large time delays that are inevitable in chemical processes; and third, the fact that many RL algorithms do not make effective use of prior knowledge of the process, resulting in long training times and poor robustness. To address the low sampling efficiency and the difficulty of handling large time delays in
existing RL algorithms, a model predictive control (MPC) guided reinforcement learning control scheme (MP-RLC) is developed in this thesis. On the one hand, this scheme uses MPC to guide RL training and improve sampling efficiency; on the other hand, the predictive model in MPC can be used to help RL deal with large time delays. Numerical simulation on a linear system with a large time delay verifies that the proposed scheme improves sampling efficiency, which shortens the RL training time, and achieves better control performance than both MPC and the basic RL scheme. To address the limited use of prior knowledge in RL algorithms, a pretraining method based on imitation learning of MPC (Model Predictive Control-Behavior Cloning, MPC-BC) is proposed. This method imitates the control actions of MPC in an offline manner. Simulation results on a simple linear system show that MPC-BC provides an acceptable initial controller for RL training, thereby reducing the magnitude of exploration and accelerating the training phase. To verify the practicability of the above schemes, simulation tests were carried out on a typical chemical process, the continuous stirred-tank reactor (CSTR). The temperature control of the CSTR verified the feasibility and advantages of the proposed RL control scheme. For the more complicated concentration control task, the applicability of the proposed control schemes was demonstrated and further improved by combining them with the Hindsight Experience Replay method, and the effectiveness of the MPC-BC pretraining method was verified. The methods proposed in this thesis are expected to provide a useful reference for the research and practice of RL technology applied to actual chemical industry processes. |
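
The offline imitation step behind MPC-BC can be illustrated with a small sketch. This is not the thesis implementation: it assumes a scalar linear system x[k+1] = a*x[k] + b*u[k], an unconstrained finite-horizon quadratic MPC whose first move reduces to a linear feedback u = -K*x, and a linear policy fitted by least squares in place of a neural network. All function names and numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np


def mpc_first_gain(a, b, q, r, horizon):
    """Gain K such that the first move of an unconstrained finite-horizon
    linear-quadratic MPC is u = -K * x (backward Riccati recursion).
    Assumes a scalar system and a terminal weight equal to q."""
    p = q
    k = 0.0
    for _ in range(horizon):
        k = (b * p * a) / (r + b * p * b)
        p = q + a * p * a - a * p * b * k
    return k


def collect_demonstrations(gain, n_samples, rng):
    """Sample states offline and label them with the MPC action."""
    states = rng.uniform(-1.0, 1.0, size=n_samples)
    actions = -gain * states
    return states, actions


def behavior_clone(states, actions):
    """Fit a linear policy u = w * x to the demonstrations by least squares."""
    features = states.reshape(-1, 1)
    w, *_ = np.linalg.lstsq(features, actions, rcond=None)
    return float(w[0])


# Hypothetical system and cost parameters (not taken from the thesis).
a, b, q, r = 0.9, 0.5, 1.0, 0.1
rng = np.random.default_rng(0)

expert_gain = mpc_first_gain(a, b, q, r, horizon=20)
states, actions = collect_demonstrations(expert_gain, 200, rng)
cloned_weight = behavior_clone(states, actions)

# Closed-loop pole under the cloned policy; |pole| < 1 means stable.
closed_loop_pole = a + b * cloned_weight
```

In an RL pipeline, a policy cloned this way would serve only as the starting point for training, which is the role the abstract assigns to MPC-BC. With noise-free demonstrations and a linear expert the fit is essentially exact; a constrained MPC or a nonlinear plant such as the CSTR would call for a richer policy class.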