
Research On Approximate Dynamic Programming Theory Based On Neural Networks And Its Applications

Posted on: 2012-05-18; Degree: Doctor; Type: Dissertation
Country: China; Candidate: L L Cui
GTID: 1220330467981067; Subject: Control theory and control engineering
Abstract/Summary:
Optimal control theory is an important part of modern control theory, and both the range and the depth of its research have advanced considerably. However, many important theoretical problems remain to be studied. Approximate dynamic programming, a new approach that solves optimal control problems while avoiding the "curse of dimensionality", has attracted the attention of many researchers. Meanwhile, neural network-based adaptive control has long been regarded as an effective method for controlling complex nonlinear, uncertain, and unknown systems, and it remains an active research topic in the control field. Combining the merits of these two methods, approximate dynamic programming theory based on neural networks has become a novel research subject with both theoretical significance and broad application prospects. Using this theory, this dissertation studies the tracking control of mechanical systems, the adaptive critic design of continuous-time nonaffine nonlinear systems, the optimal tracking control of unknown continuous-time nonlinear systems, the optimal stabilization control of discrete-time nonlinear systems with actuator saturation, the zero-sum games of discrete-time linear systems, and the non-zero-sum differential games of continuous-time nonlinear systems. The main contents of the dissertation are as follows:

1. A novel control scheme is proposed for the tracking problem of mechanical systems in the presence of external vibrations and friction. First, a neural network (NN) with the accelerometer signal as its input provides feedforward compensation of external vibrations. A second NN then compensates for the unknown system dynamics, and thereby for friction. In contrast to the uniformly ultimately bounded results obtained with typical NN-based controllers, a semi-global asymptotic stability result is obtained by integrating the robust integral of the sign of the error (RISE) feedback term into the control input. In particular, exact models of the plant, the external disturbances, and the accelerometer are not needed.

2. A novel NN-based robust adaptive critic design (ACD) is proposed for a class of continuous-time nonaffine nonlinear systems. In contrast to conventional ACD methods, an action NN is employed to approximate the derived unknown uncertain term instead of the nonaffine nonlinear function, so the ACD method is extended to nonaffine nonlinear systems for the first time. Furthermore, for systems with unknown control direction, a robust ACD based on fuzzy wavelet networks (FWNs) is proposed. Two FWNs implement the control element and the critic element, and the weights and the dilation and translation parameters are tuned online. Lyapunov theory is used to derive the novel tuning laws for the NN weights and the adapted parameter and to prove the uniform ultimate boundedness of all signals of the closed-loop system.

3. For the first time, a novel data-driven robust approximate optimal tracking control scheme is proposed for a class of unknown continuous-time nonlinear systems by using the adaptive dynamic programming (ADP) method. The proposed scheme requires only measurable input/output data rather than knowledge of the system dynamics. First, a data-driven model is established by a recurrent neural network that reconstructs the unknown system dynamics from measurable input/output data. Then, based on the obtained data-driven model, the ADP method is used to design the approximate optimal tracking controller. Finally, using the Lyapunov approach, it is rigorously proved that the proposed control scheme guarantees that the tracking error converges asymptotically to zero and that the obtained control input stays within a small bound of the optimal control input.

4. An iterative algorithm based on the generalized Hamilton-Jacobi-Bellman (GHJB) method is proposed to seek the optimal saturated control for a class of nonlinear discrete-time systems with actuator saturation. First, a novel nonquadratic functional is introduced to handle the control constraints of nonlinear discrete-time systems, and the corresponding constrained GHJB (C-GHJB) equation and constrained HJB (C-HJB) equation are derived. Then, a novel iterative algorithm based on the C-GHJB equation is proposed to solve the C-HJB equation, and its convergence is proved rigorously. Finally, a neural network is employed to approximate the cost function in order to implement the proposed iterative algorithm.

5. A novel data-based adaptive critic design (ACD) is proposed for the zero-sum games of discrete-time linear systems. First, the data-based system state equation is derived from the measurable input/output data. Next, the data-based performance index function and optimal control policies are derived, and the uniqueness of the optimal control policies is proved. Then, a novel data-based iterative ACD algorithm is proposed to obtain the saddle point of the zero-sum games. In particular, the proposed ACD method requires neither an exact model of the system nor measurement of the full system state.

6. For the first time, a near-optimal control scheme is proposed to solve the non-zero-sum differential games of continuous-time nonlinear systems based on single-network adaptive dynamic programming (ADP). In the proposed control scheme, only one critic network is used for each player instead of the action-critic dual network used in a typical ADP architecture. Furthermore, novel tuning laws for the critic networks are proposed. These laws guarantee not only that the cost functions reach the Nash equilibrium of the non-zero-sum differential games, but also that all signals of the closed-loop system are uniformly ultimately bounded, without requiring initial stabilizing control policies.

Finally, concluding remarks are given. Some unsolved problems and development directions for approximate dynamic programming theory based on neural networks are identified, and prospects for further study are discussed.
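As an illustration of the saddle-point problem for zero-sum games of discrete-time linear systems mentioned in the abstract, the following is a minimal sketch only. It is not the data-based algorithm of the dissertation: it assumes the model x(k+1) = A x(k) + B u(k) + E w(k) is fully known, uses the stage cost x'Qx + u'Ru - gamma^2 w'w, and runs a textbook model-based value iteration on the game Riccati equation starting from P = 0. All symbols (A, B, E, Q, R, gamma), function names, and the numerical values in the demo are assumptions chosen for illustration.

```python
import numpy as np


def game_riccati_step(P, A, B, E, Q, R, gamma):
    """One value-iteration update P_i -> P_{i+1} for the zero-sum LQ game."""
    q = E.shape[1]
    # Block matrix coupling the controller u (minimizer) and disturbance w (maximizer).
    M = np.block([
        [R + B.T @ P @ B, B.T @ P @ E],
        [E.T @ P @ B, E.T @ P @ E - gamma**2 * np.eye(q)],
    ])
    N = np.vstack([B.T @ P @ A, E.T @ P @ A])
    return Q + A.T @ P @ A - N.T @ np.linalg.solve(M, N)


def solve_zero_sum_game(A, B, E, Q, R, gamma, iters=500):
    """Iterate the game Riccati recursion from P = 0 and return (P, K, L).

    The saddle-point policies are u = K x and w = L x.
    """
    n, m, q = A.shape[0], B.shape[1], E.shape[1]
    P = np.zeros((n, n))
    for _ in range(iters):
        P = game_riccati_step(P, A, B, E, Q, R, gamma)
    # Stationarity conditions give M [u; w] = -N x at the fixed point.
    M = np.block([
        [R + B.T @ P @ B, B.T @ P @ E],
        [E.T @ P @ B, E.T @ P @ E - gamma**2 * np.eye(q)],
    ])
    N = np.vstack([B.T @ P @ A, E.T @ P @ A])
    gains = -np.linalg.solve(M, N)
    return P, gains[:m], gains[m:]


if __name__ == "__main__":
    # Hypothetical two-state example; the numbers carry no special meaning.
    A = np.array([[0.9, 0.1], [0.0, 0.8]])
    B = np.array([[0.0], [1.0]])
    E = np.array([[0.1], [0.0]])
    Q = np.eye(2)
    R = np.array([[1.0]])
    P, K, L = solve_zero_sum_game(A, B, E, Q, R, gamma=5.0)
    residual = np.linalg.norm(P - game_riccati_step(P, A, B, E, Q, R, 5.0))
    print("P =\n", P)
    print("control gain K =", K, "disturbance gain L =", L)
    print("Riccati residual =", residual)
```

The value function of the game is x'Px, and the printed residual indicates how closely the final P satisfies the game Riccati equation. A data-based design such as the one summarized in item 5 would replace the known matrices with quantities identified from measured input/output data; that step is not shown here.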
Keywords/Search Tags: Optimal control, neural networks, approximate dynamic programming, adaptive critic design, adaptive control, nonaffine nonlinear systems, tracking control, zero-sum games, non-zero-sum games, data-driven