
Adaptive Dynamic Programming Optimal Control For Discrete Nonlinear Systems With Unknown Model And Constraints

Posted on: 2024-11-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S J Song
Full Text: PDF
GTID: 1528307373470894
Subject: Mechanical engineering
Abstract/Summary:
Adaptive Dynamic Programming (ADP) is an intelligent control approach that integrates dynamic programming, reinforcement learning, and function approximation. Because it successfully addresses the "curse of dimensionality," ADP has been widely regarded as an effective method for solving optimal control problems of nonlinear systems since it was proposed. In recent years, both the academic and industrial communities have paid increasing attention to the transition of ADP from theoretical research to practical application. To achieve this leap, however, several theoretical issues of practical significance remain to be addressed; among these, the optimal control of nonlinear systems with unknown models and constraints is a typical problem. Accordingly, this dissertation starts from the optimal control problem of discrete nonlinear systems with completely unknown models and progresses from unconstrained systems to constrained systems, investigating ADP-based optimal control methods for discrete nonlinear systems with unknown models, asymmetric input saturation, and state constraints. The main findings of this dissertation are as follows:

1. For unknown discrete nonlinear systems, a hybrid data-driven value iteration ADP algorithm is proposed to solve the optimal control problem. First, off-policy learning is introduced into the state-value-function-based ADP, enabling learning from real data while improving the robustness of the algorithm and the efficiency of data utilization. Furthermore, by introducing a deterministic-data learning mechanism, approximation errors in the iterative control policy during the iteration process are avoided. In addition, by combining off-policy learning, deterministic-data learning, and a model neural network, model errors during the iteration process are reduced, and the requirements on the generalization capability of the model neural network are relaxed. At the theoretical level, a novel convergence analysis method for value iteration ADP is presented that accounts for model approximation errors, and the stability of the closed-loop system under the obtained controller is further analyzed.

2. For unknown discrete nonlinear systems, a data-driven value iteration ADP algorithm based on an error cost is proposed to solve the optimal tracking problem. First, a new cost function is introduced that not only theoretically guarantees the complete elimination of tracking errors but also avoids the model errors introduced by estimating the expected control. Second, to remove the dependency of the utility function on the system model, a completely data-based utility function is designed, making the utility function model-free. Subsequently, integration with the hybrid data-driven ADP framework proposed in this dissertation improves the robustness of the algorithm to the learning data. At the theoretical level, new stability analysis methods for value iteration tracking control are provided under non-definite utility functions and the influence of modeling errors.

3. For unknown discrete nonlinear systems with asymmetric input saturation, a value iteration ADP algorithm based on an improved penalty function is proposed to address the optimal tracking problem. First, a new penalty function is designed. Compared with ADP algorithms based on traditional penalty functions, the ADP algorithm designed with the new penalty function exhibits a smaller deviation between the obtained control policy and the optimal control policy. Second, by applying this penalty function to the utility function of the ADP algorithm in the form of an amplification coefficient, the proposed penalty function can be used to solve the optimal tracking problem. Finally, integration with the proposed data-driven ADP framework enables optimal tracking control of discrete nonlinear systems with unknown models and asymmetric input saturation.

4. For unknown discrete nonlinear systems with both input saturation and state constraints, a value iteration ADP algorithm based on safety policy improvement is proposed to solve the optimal tracking problem. First, the safety control space of the constrained system is defined, providing a formal unification of state constraints and input saturation. Second, to address the unknown boundaries of the safety control space, a space-compression safety-search approach is proposed, ensuring that the control inputs computed by policy improvement do not violate the constraints. These two steps are then combined with traditional policy improvement to form a safety policy improvement mechanism that simultaneously handles state constraints and input saturation, and a value iteration ADP algorithm framework based on this mechanism is provided. Finally, integration with the data-driven ADP framework proposed in this dissertation enables model-free optimal tracking control of discrete nonlinear systems with both input saturation and state constraints.
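All four contributions above build on the value iteration recursion V_{i+1}(x) = min_u [U(x, u) + V_i(f(x, u))]. As a minimal sketch of that recursion, the following Python example runs value iteration for a scalar discrete-time linear system x_{k+1} = a x_k + b u_k with quadratic cost, where the value function can be parameterized exactly as V_i(x) = p_i x^2 and convergence can be checked in closed form. This is only an illustrative, model-based special case; the dissertation's algorithms are data-driven, handle unknown nonlinear dynamics and constraints, and use neural-network approximators instead of a quadratic parameterization. All numerical values (a, b, q, r) here are illustrative assumptions, not from the dissertation.

```python
# Value iteration for x_{k+1} = a*x_k + b*u_k with cost sum_k (q*x_k^2 + r*u_k^2).
# With V_i(x) = p_i * x^2, the recursion V_{i+1}(x) = min_u [U(x,u) + V_i(ax+bu)]
# reduces to a scalar update on p_i (the discrete-time Riccati recursion).

def value_iteration(a, b, q=1.0, r=1.0, tol=1e-10, max_iter=1000):
    """Iterate p_{i+1} = q + a^2 p_i - (a b p_i)^2 / (r + b^2 p_i),
    starting from p_0 = 0 (the standard value-iteration initialization)."""
    p = 0.0
    for _ in range(max_iter):
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p

def optimal_gain(a, b, p, r=1.0):
    # The minimizing control under V(x) = p x^2 is u_k = -k * x_k.
    return a * b * p / (r + b * b * p)

# Example with an open-loop unstable system (|a| > 1).
p = value_iteration(a=1.2, b=1.0)
k = optimal_gain(1.2, 1.0, p)
# p converges to the solution of the scalar discrete-time algebraic
# Riccati equation, and the closed loop a - b*k is stable (|a - b*k| < 1).
```

The same monotone, from-below convergence of p_i toward the optimal value is what the dissertation's convergence analysis establishes for the neural-network case, additionally accounting for model and approximation errors.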
Keywords/Search Tags:Adaptive Dynamic Programming, Optimal control, Discrete-time nonlinear system, Neural network, Value iteration