
Research On Adaptive Dynamic Programming Approach For Optimal Control Of Nonlinear Uncertain Systems

Posted on: 2018-01-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X H Cui
Full Text: PDF
GTID: 1360330572464554
Subject: Control theory and control engineering
Abstract/Summary:
The optimal control of nonlinear systems has attracted increasing attention. The key to optimal control lies in solving the Hamilton-Jacobi-Bellman (HJB) equation. Adaptive/Approximate Dynamic Programming (ADP), as an effective method for finding the solution of the HJB equation, overcomes the computational complexity of traditional dynamic programming. ADP combines reinforcement learning, adaptive techniques, dynamic programming theory, and neural networks, and solves optimal control problems forward in time, which is why it has received considerable attention. Based on ADP, this dissertation addresses several optimal control problems: finite-horizon optimal control of unknown systems with saturating control inputs, nonzero-sum games with partially unknown dynamics and constrained inputs, H∞ tracking design of uncertain nonlinear systems with disturbances and input constraints, and finite-horizon optimal control of unknown nonlinear time-delay systems. The main contents are as follows:

(1) An adaptive dynamic programming (ADP)-based online integral reinforcement learning algorithm is designed for the finite-horizon optimal control of nonlinear continuous-time systems with saturating control inputs and partially unknown dynamics, and the convergence of the algorithm is proved. First, the control constraints are handled through a nonquadratic penalty function (a representative form is sketched after this abstract). Second, a single neural network (NN) with constant weights and time-dependent activation functions is designed to approximate the unknown continuous value function; compared with the traditional dual-NN structure, the computational burden of the single NN is reduced. The NN weights are updated by the least-squares method, taking both the residual error and the terminal error into account. Furthermore, the convergence of the NN-based iterative value function is proved. Finally, two simulation examples show the effectiveness of the proposed algorithm.

(2) An online optimal learning algorithm based on the adaptive dynamic programming (ADP) approach is designed to solve the finite-horizon optimal control problem for multi-player nonzero-sum games with partially unknown dynamics and constrained control inputs. First, it is proved that the online policy iteration (PI) algorithm is equivalent to Newton's iteration. Second, single neural networks (NNs) with time-varying activation functions, one for each player, are used to approximate the time-varying solutions of the coupled Hamilton-Jacobi-Bellman (HJB) equations in an online, forward-in-time manner; the control constraints are again handled through nonquadratic functions. The convergence of the NN-based online optimal learning algorithm for multi-player nonzero-sum games is also proved. Finally, a simulation example illustrates the effectiveness of the proposed algorithm.

(3) A neural network (NN)-based online off-policy algorithm is proposed to optimize a class of nonlinear continuous-time time-delay systems over a finite time horizon. The online off-policy algorithm learns the two-stage solution of the time-varying Hamilton-Jacobi-Bellman (HJB) equation without requiring knowledge of the time-delay system dynamics (the integral reinforcement relation underlying this model-free learning is sketched after this abstract). The algorithm is implemented with an actor-critic NN structure with time-varying activation functions, and the weights of the two NNs are tuned simultaneously in real time by considering both the residual error and the terminal error. Two simulation examples demonstrate the applicability of the proposed algorithm.

(4) An H∞ tracking controller is designed for uncertain nonlinear systems with external disturbances and input constraints. A discounted nonquadratic performance function is introduced, which encodes the constrained input into the H∞ performance (a representative form is sketched after this abstract). The key difficulty of H∞ tracking control is the need to solve the tracking Hamilton-Jacobi-Isaacs (HJI) equation, a partial differential equation that is extremely difficult or impossible to solve analytically even in simple cases. To overcome this difficulty, an online model-free integral reinforcement learning (IRL) algorithm is proposed to learn the solution of the tracking HJI equation online in real time without requiring any knowledge of the system dynamics. To implement it, critic-actor-disturbance neural networks (NNs) are built and the three NNs are updated simultaneously. Stability and convergence are analyzed by the Lyapunov method. In addition, a robust term is added to the controller to attenuate the effect of the NN approximation errors, which yields asymptotic stability of the closed-loop system. Finally, two simulation examples show the effectiveness of the proposed algorithm.

(5) A neural network (NN)-based online model-free integral reinforcement learning (IRL) algorithm is developed to solve the finite-horizon H∞ optimal tracking control problem for completely unknown nonlinear continuous-time systems with disturbances and saturating actuators (constrained control inputs). An augmented system is constructed from the tracking error system and the command generator system. A time-varying Hamilton-Jacobi-Isaacs (HJI) equation is formulated for the augmented problem; it is extremely difficult or impossible to solve due to its time-dependent property and nonlinearity. Then, an actor-critic-disturbance NN scheme is proposed to learn the time-varying solution of the HJI equation in real time without using knowledge of the system dynamics. Since the solution of the HJI equation is time-dependent, the NNs are represented with constant weights and time-dependent activation functions. Furthermore, an extra error term is incorporated into the weight update law in order to satisfy the terminal constraints. Convergence and stability proofs are given based on Lyapunov theory for nonautonomous systems. Two simulation examples are provided to demonstrate the effectiveness of the designed algorithm.

Lastly, some concluding remarks are given, and several unsolved problems and future directions for adaptive dynamic programming are discussed.
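As referenced in items (1) and (2), a common device in the ADP literature for handling saturating inputs is to replace the quadratic control cost with a nonquadratic penalty. The following is a minimal sketch, assuming an input-affine system \dot{x}=f(x)+g(x)u with input bound |u_i| \le \lambda, state cost Q(x), terminal cost \psi, and finite-horizon value function V(x,t); the symbols are illustrative and may differ from the dissertation's notation.

\[
W(u) = 2\int_{0}^{u} \lambda \tanh^{-1}(v/\lambda)^{\top} R \, dv, \qquad
V(x,t) = \min_{u}\Big[ \int_{t}^{T} \big( Q(x(\tau)) + W(u(\tau)) \big)\, d\tau + \psi(x(T)) \Big],
\]
\[
-\frac{\partial V}{\partial t} = \min_{u}\Big[ Q(x) + W(u) + \nabla_x V^{\top}\big( f(x) + g(x)u \big) \Big], \qquad V(x,T) = \psi(x(T)),
\]
\[
u^{*}(x,t) = -\lambda \tanh\!\Big( \tfrac{1}{2\lambda} R^{-1} g(x)^{\top} \nabla_x V(x,t) \Big).
\]

Because the minimizing control passes through tanh, it satisfies the input bound by construction, which is why nonquadratic functions are used to encode the constraints in items (1), (2), (4), and (5).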
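As referenced in item (3) (and also used in item (1)), the critic is a single NN with constant weights and time-dependent activation functions, fitted by least squares against a residual (Bellman) error and a terminal error; the integral reinforcement form removes the need for the drift dynamics. A minimal sketch, assuming the approximation \hat{V}(x,t) = W_c^{\top}\sigma(x,t) and a reinforcement interval T_s (W_c, \sigma, and T_s are illustrative symbols, not necessarily the dissertation's):

\[
\hat{V}(x(t),t) = W_c^{\top}\sigma(x(t),t), \qquad
\hat{V}(x(t),t) = \int_{t}^{t+T_s} \big( Q(x) + W(u) \big)\, d\tau + \hat{V}\big(x(t+T_s),\, t+T_s\big),
\]
\[
e_B = W_c^{\top}\big[ \sigma(x(t),t) - \sigma(x(t+T_s),t+T_s) \big] - \int_{t}^{t+T_s} \big( Q(x) + W(u) \big)\, d\tau, \qquad
e_T = W_c^{\top}\sigma(x(T),T) - \psi(x(T)).
\]

The weights W_c are then chosen by least squares to minimize a weighted sum of the squared residual errors collected along the trajectory and the squared terminal error, so the fit respects both the Bellman relation and the terminal condition V(x,T)=\psi(x(T)).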
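As referenced in item (4), the discounted nonquadratic performance index and the tracking HJI equation can be written in a representative form. A minimal sketch, assuming an augmented state X (tracking error stacked with the reference command), augmented dynamics \dot{X} = F(X) + G(X)u + K(X)d, discount factor \alpha > 0, attenuation level \gamma, and a weight \bar{Q} acting only on the tracking-error part of X; again the notation is illustrative, not the dissertation's.

\[
J(X;u,d) = \int_{t}^{\infty} e^{-\alpha(\tau - t)} \Big( X^{\top}\bar{Q}X + W(u) - \gamma^{2} d^{\top} d \Big)\, d\tau,
\]
\[
-\alpha V + \nabla V^{\top}\big( F(X) + G(X)u^{*} + K(X)d^{*} \big) + X^{\top}\bar{Q}X + W(u^{*}) - \gamma^{2} {d^{*}}^{\top} d^{*} = 0,
\]
\[
u^{*} = -\lambda \tanh\!\Big( \tfrac{1}{2\lambda} R^{-1} G(X)^{\top} \nabla V \Big), \qquad
d^{*} = \tfrac{1}{2\gamma^{2}} K(X)^{\top} \nabla V.
\]

The finite-horizon counterpart in item (5) replaces V(X) with a time-varying V(X,t) and adds a \partial V/\partial t term; this time dependence is what motivates the NN representation with constant weights and time-dependent activation functions.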
Keywords/Search Tags: Adaptive dynamic programming, time delay, neural network, finite horizon, robust control, tracking control, optimal control