
Optimization Control Methods Based On Approximate Dynamic Programming And Its Applications In Autonomous Land Vehicles

Posted on: 2017-07-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Q Lia
GTID: 1310330536467144
Subject: Control Science and Engineering
Abstract/Summary:
In recent years, approximate dynamic programming (ADP) has been widely used to solve complex optimization and decision problems. The basic theoretical question addressed in this dissertation is how to improve the generalization capability and real-time optimization capability of ADP. In addition, with the support of a key project under the major research plan of the National Natural Science Foundation, this dissertation studies ADP-based lateral control of autonomous land vehicles (ALVs) for different road shapes and vehicle speeds. The main research results and innovations are as follows:

(1) To address the low learning efficiency and difficult feature selection of conventional ADP methods, ADP based on sparse kernel machines (KADP) is proposed. Sparse kernel machines are used to construct the basis functions, and the update rules in the critic are based on the recursive least-squares temporal difference (RLS-TD) algorithm. Theoretical analysis shows that the critic in KADP obtains smaller approximation errors and faster convergence because of the favorable representation learning and generalization capabilities of sparse kernel machines. Simulation and experimental results for inverted pendulum systems show that, compared with conventional ADP methods, KADP has better control performance and converges about 30% faster.

(2) Approximate dynamic programming with graph Laplacian (GL-ADP) is proposed. Manifold learning is integrated into ADP, and the graph Laplacian operator is used to construct the basis functions. The update rules in the critic are again based on the RLS-TD algorithm. Theoretical analysis shows that although the computational complexity of GL-ADP is usually higher than that of KADP, GL-ADP avoids manual selection of the kernel function type and empirical parameters. Simulation results for continuous stirred tank reactor (CSTR) and ball-and-plate systems show that GL-ADP has better control performance than conventional ADP. In addition, compared with KADP, GL-ADP has a higher computational cost but improves convergence speed and control performance by about 18% and 2%, respectively.

(3) Receding horizon ADP (RHADP) is presented for discrete-time systems. Finite-horizon ADP is employed to obtain a closed-loop optimal control policy in each prediction horizon. The convergence of RHADP and the stability of the controlled systems are proved. Moreover, the computational complexity of the proposed method is shown to be O(N^2), whereas that of nonlinear model predictive control (NMPC) using the interior point method as the optimization technique is O(N^3 L). Simulation results for the trajectory tracking problem of mobile robots and the control problem of the Van der Pol oscillator show that, compared with conventional NMPC using the interior point method, RHADP has better control performance and lower computational cost.

(4) A self-optimizing lateral control method for ALVs is presented. First, a Markov decision process (MDP) model of the ALV lateral control problem is established. Then a closed-loop optimal policy for the lateral control problem is obtained with the kernel-based dual heuristic programming (KDHP) algorithm. Because the KDHP algorithm has favorable self-optimization and generalization capabilities, it helps achieve high control accuracy for different road shapes and vehicle speeds. In an experiment covering about 200 km (including large-curvature paths, campus roads, urban roads and highways), the average lateral error is about 0.18 m. Compared with the feedback control method based on preview and a kinematic model that has been used in the test car, the proposed method has higher control accuracy under certain conditions. In addition, the "cutting corners" problem that occurs when ALVs execute large-curvature turns is avoided.

The research results of this dissertation have been applied to the autonomous driving test car.
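
As a rough illustration of the RLS-TD critic update named in items (1) and (2), the sketch below applies a standard RLS-TD(0) recursion to a linear critic V(s) ≈ theta · phi(s). It is not the dissertation's implementation: the one-hot features stand in for the sparse-kernel or graph-Laplacian basis functions, and the two-state cycle, the function names, the forgetting factor `mu`, and the initial scale `delta` are all assumptions made for the example.

```python
def rls_td_update(theta, P, phi, phi_next, reward, gamma=0.9, mu=1.0):
    """One RLS-TD(0) step for a linear critic V(s) ~ theta . phi(s).

    theta: weight vector, P: inverse-correlation matrix estimate,
    mu: forgetting factor (assumed 1.0 here, i.e. no forgetting).
    """
    n = len(theta)
    # Difference of feature vectors: d = phi(s) - gamma * phi(s')
    d = [phi[i] - gamma * phi_next[i] for i in range(n)]
    # P @ phi (column vector) and d^T @ P (row vector)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    d_P = [sum(d[i] * P[i][j] for i in range(n)) for j in range(n)]
    # Gain vector K = P phi / (mu + d^T P phi)
    denom = mu + sum(d[i] * P_phi[i] for i in range(n))
    gain = [x / denom for x in P_phi]
    # TD error: r + gamma * V(s') - V(s) = r - d^T theta
    td_error = reward - sum(d[i] * theta[i] for i in range(n))
    new_theta = [theta[i] + gain[i] * td_error for i in range(n)]
    # P <- (P - K d^T P) / mu
    new_P = [[(P[i][j] - gain[i] * d_P[j]) / mu for j in range(n)]
             for i in range(n)]
    return new_theta, new_P


def evaluate_two_state_cycle(cycles=100, gamma=0.5, delta=1000.0):
    """Evaluate a hypothetical deterministic cycle 0 -> 1 -> 0 -> ...

    Leaving state 0 gives reward 1, leaving state 1 gives reward 0.
    """
    features = [[1.0, 0.0], [0.0, 1.0]]  # one-hot features (assumption)
    rewards = [1.0, 0.0]
    theta = [0.0, 0.0]
    P = [[delta, 0.0], [0.0, delta]]  # P_0 = delta * I
    state = 0
    for _ in range(2 * cycles):
        next_state = 1 - state
        theta, P = rls_td_update(theta, P, features[state],
                                 features[next_state], rewards[state], gamma)
        state = next_state
    return theta


if __name__ == "__main__":
    print(evaluate_two_state_cycle())
```

For this toy chain with gamma = 0.5 the exact values are V(0) = 4/3 and V(1) = 2/3, so the learned weights should approach those numbers. Replacing the one-hot vectors with kernel or Laplacian-eigenvector features changes only how `phi` is computed; the recursion itself stays the same.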
Keywords/Search Tags:Reinforcement learning, Approximate dynamic programming, Receding optimization, Automatic driving, Motion control