
Self-learning Control Methods Based On Approximate Dynamic Programming And Its Applications

Posted on: 2018-09-28    Degree: Master    Type: Thesis
Country: China    Candidate: J H Liu    Full Text: PDF
GTID: 2370330623950755    Subject: Control Science and Engineering
Abstract/Summary:
In control science and engineering, people tend to design simple and efficient control methods to handle a wide variety of dynamic systems. Existing advanced control methods generally require prior knowledge of the controlled object's model, or expert experience, when solving nonlinear system problems, which makes them difficult to apply to uncertain systems. Self-learning control based on approximate dynamic programming, as a branch of reinforcement learning, can learn an optimal control policy adapted to the current complex dynamic system from observed data. Meanwhile, smart vehicles are developing rapidly, and intelligent driving technology has become a technological high ground over which high-tech companies and traditional automobile manufacturers compete. A vehicle's dynamic model changes slightly and becomes uncertain under different driving conditions. This paper therefore applies self-learning control to the motion control problem of intelligent vehicles. It mainly studies self-learning control methods based on approximate dynamic programming and uses them to solve the path-tracking problem of an intelligent vehicle. The main achievements and innovations of the paper are as follows:

(1) We propose a model-free multi-kernel learning control (MMLC) method, aiming at two problems: the difficulty of choosing the kernel width parameter in single-kernel-based approximate dynamic programming, and the need for partial model information in classical approximate dynamic programming. The method uses sparse kernel machines and a multi-kernel framework to construct the basis functions of the approximation structure. On the one hand, recursive least-squares temporal-difference learning is used to train the critic, which approximates the optimal value function; on the other hand, direct gradient descent is used to update the weights of the actor network, which approximates the optimal control policy. The multi-kernel method has more stable performance and a more flexible structure than single-kernel methods and, as expected, is more robust to parameter selection, because the multi-kernel learning framework has stronger feature representation ability than single-kernel methods and therefore converges to equilibrium faster. Moreover, the approximation structure under the multi-kernel framework strengthens the reinforcement learning characteristics, so that the weights between the kernels converge during learning; in other words, the parameters of the multi-kernel approximation structure are adjusted adaptively according to the actual data distribution.

(2) We propose a discretization structure for continuous-time systems and a model-free actor-critic method with finite-time receding-horizon optimization. Simulation experiments show that finite-time receding-horizon optimization greatly improves the learning efficiency of the model-free actor-critic method: the learning success rate increases, and the average policy learning time is only about 5% of that of the original method. This confirms that finite-time receding-horizon optimization is well suited to, and substantially improves, learning methods with low computational complexity but low learning efficiency.
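To make the actor-critic structure behind contributions (1) and (2) more concrete, the following is a minimal Python sketch of a model-free actor-critic with multi-kernel Gaussian basis functions, an RLS-TD(0) critic, and a gradient-descent actor. It is illustrative only: the kernel centers and widths, the learning rates, the exploration noise, and the TD-error-driven actor update are assumptions made for this sketch, not the thesis's exact MMLC construction (in particular, the sparse dictionary learning and adaptive kernel-weight adjustment described above are omitted).

```python
# Minimal sketch (not the thesis's exact MMLC algorithm) of a model-free
# actor-critic with multi-kernel Gaussian basis functions:
#   - critic: recursive least-squares TD(0) on V(s) = w^T phi(s)
#   - actor : linear-in-features policy updated by gradient descent
import numpy as np

class MultiKernelFeatures:
    """Concatenate Gaussian kernel features for several widths (a simple
    stand-in for the sparse multi-kernel basis described in the abstract)."""
    def __init__(self, centers, widths):
        self.centers = np.asarray(centers)   # (D, state_dim) kernel centers
        self.widths = widths                 # list of M kernel widths

    def __call__(self, s):
        d2 = np.sum((self.centers - s) ** 2, axis=1)               # (D,)
        feats = [np.exp(-d2 / (2.0 * sig ** 2)) for sig in self.widths]
        return np.concatenate(feats)                                # (D * M,)

class ActorCriticRLSTD:
    def __init__(self, phi, n_feat, gamma=0.95, actor_lr=0.01, p0=10.0):
        self.phi, self.gamma, self.actor_lr = phi, gamma, actor_lr
        self.w = np.zeros(n_feat)            # critic weights
        self.theta = np.zeros(n_feat)        # actor weights
        self.P = p0 * np.eye(n_feat)         # RLS covariance matrix

    def action(self, s, noise=0.1):
        # Linear-in-features policy with Gaussian exploration noise.
        return float(self.theta @ self.phi(s)) + noise * np.random.randn()

    def update(self, s, r, s_next):
        x, x_next = self.phi(s), self.phi(s_next)
        d = x - self.gamma * x_next
        # RLS-TD(0) critic update for the value-function weights.
        k = self.P @ x / (1.0 + d @ self.P @ x)
        td_err = r - d @ self.w
        self.w += k * td_err
        self.P -= np.outer(k, d @ self.P)
        # Gradient-descent actor update driven by the TD error
        # (one common choice; the thesis's direct gradient rule may differ).
        self.theta += self.actor_lr * td_err * x
        return td_err

# Example wiring (dimensions only): 25 centers x 3 kernel widths -> 75 features.
# centers = np.random.uniform(-1, 1, size=(25, 2))
# phi = MultiKernelFeatures(centers, widths=[0.2, 0.5, 1.0])
# agent = ActorCriticRLSTD(phi, n_feat=25 * 3)
```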
(3) We propose a lateral control method for high-precision vehicle path tracking based on MMLC. First, a state transition model of the error between the vehicle's actual path and the expected path is established for the lateral control problem of the intelligent vehicle, and the lateral controller is then designed using the MMLC algorithm. The control policy is updated by on-policy learning, and the learned lateral controller is tested in multiple scenarios built on the SIMULINK + PRESCAN platform and compared with three advanced lateral control methods for vehicle path tracking. The results show that the self-learning controller achieves better control performance.
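As a complement to contribution (3), the sketch below shows how a learned lateral controller could be rolled out on a simple path-tracking-error model. The abstract does not give the thesis's exact error-state transition model, test scenarios, or the SIMULINK + PRESCAN interface, so the small-angle kinematic error dynamics, the vehicle parameters (speed v, wheelbase L, sample time T), the steering limits, and the placeholder linear actor are all assumptions made for illustration.

```python
# Illustrative only: a simple kinematic lateral tracking-error model and a
# rollout loop that queries a learned actor for the steering command.
import numpy as np

def lateral_error_step(e_y, e_psi, delta, kappa_ref, v=10.0, L=2.7, T=0.05):
    """One step of small-angle kinematic error dynamics along a reference path:
    e_y is the lateral offset, e_psi the heading error, delta the steering angle,
    kappa_ref the reference-path curvature."""
    e_y_next = e_y + T * v * np.sin(e_psi)
    e_psi_next = e_psi + T * (v / L * np.tan(delta) - v * kappa_ref)
    return e_y_next, e_psi_next

def run_tracking_episode(actor, kappa_profile, e_y0=0.5, e_psi0=0.0):
    """Roll out a lateral controller (e.g., a learned actor) over a curvature
    profile and return the lateral-error trace."""
    e_y, e_psi, trace = e_y0, e_psi0, []
    for kappa_ref in kappa_profile:
        state = np.array([e_y, e_psi])
        delta = float(np.clip(actor(state), -0.5, 0.5))  # steering command [rad]
        e_y, e_psi = lateral_error_step(e_y, e_psi, delta, kappa_ref)
        trace.append(e_y)
    return np.array(trace)

if __name__ == "__main__":
    # Constant-curvature reference with a hand-tuned linear "actor"
    # standing in for the learned MMLC controller.
    pd_actor = lambda s: -0.8 * s[0] - 1.5 * s[1]
    errors = run_tracking_episode(pd_actor, kappa_profile=[0.01] * 400)
    print("final lateral error [m]:", errors[-1])
```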
Keywords/Search Tags:Reinforcement learning, Approximate dynamic programming, Multi-kernel learning control, Receding optimization, Motion control