Research Of Decision-making Technology For Intelligent Driving Based On Deep Reinforcement Learning Algorithm

Posted on: 2021-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: Z G Zhang
GTID: 2392330647461924
Subject: Electronic Science and Technology
Abstract/Summary:
Autonomous driving has long been an important research direction. In recent years, the growth of the artificial intelligence industry, rapid progress in high-precision radar technology, and the commercialization of 5G have strongly supported its development. Deep reinforcement learning combines the perception ability of deep learning with the decision-making ability of reinforcement learning, which makes it well suited to autonomous driving, a task that requires perceiving the environment and deciding on driving operations. Applying deep reinforcement learning algorithms to driving decision-making is therefore a worthwhile research topic.

After reviewing the current state of unmanned driving and deep reinforcement learning, and weighing the limitations of different deep reinforcement learning algorithms, this thesis selects the deep deterministic policy gradient (DDPG) algorithm, which is suited to continuous actions, to learn a driving decision-making policy in The Open Racing Car Simulator (TORCS). Analysis of the DDPG results shows that the original algorithm trains slowly and that its training process is unstable. To address these problems, an average deep deterministic policy gradient for double imitation algorithm (Average-DDPGf DI) is proposed.

To address the slow training speed, Average-DDPGf DI uses an expert controller to guide the learning process of the original algorithm both online and offline. The expert controller is used to collect and label expert data. Experience pool separation then stores expert experience samples, high-quality experience samples, and low-quality experience samples in isolation from one another, and together these form the complete experience pool. Within the Actor-Critic structure, the Critic network guides the Actor network in learning the decision-making policy, and different loss functions are designed to update the Critic network parameters for the different kinds of experience samples.

To address the unstable training process, Average-DDPGf DI uses a reward function that is better matched to road driving. The reward encourages the vehicle to stay on the center line of the road so that it does not leave the current road. It also reflects that the vehicle should earn as much reward as possible on straight sections, and should decelerate as much as possible in curves and pass through them safely at low speed. The speed and body position of the agent are constrained accordingly. To counter the overestimation of the evaluation network in the original algorithm, a double evaluation network and an average evaluation network are used. The update speed of the policy network and the target network is also reduced to limit cumulative error. Together, these four measures stabilize the training process.

Finally, the original and improved algorithms are tested and analyzed on the TORCS simulation platform. The experimental data show that the policy learning speed of Average-DDPGf DI is about twice that of the original algorithm; that taking the average of 4 historical Q values makes the learning process more stable, with the average reward rising steadily; and that the double critic network increases the effective driving distance of the vehicle by a factor of three or more. These conclusions agree with theoretical expectations and confirm the feasibility of the approach.
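The abstract does not spell out how the double evaluation network and the average evaluation network are combined, so the following is only a minimal sketch under stated assumptions. It reads "evaluation network" as the Critic, takes the smaller of two critic estimates (the clipped double-Q idea known from TD3), and averages that value over the last k stored critic snapshots (the idea known from Averaged-DQN). The class name `AveragedDoubleCriticTarget`, the `Critic` callable type, and the default `gamma` are hypothetical and do not come from the thesis; `k=4` mirrors the 4 historical Q values mentioned above.

```python
from collections import deque
from typing import Callable, Deque, Sequence, Tuple

# Hypothetical type: a critic maps (state, action) to a scalar Q estimate.
Critic = Callable[[Sequence[float], Sequence[float]], float]


class AveragedDoubleCriticTarget:
    """TD target built from two critics and averaged over k past snapshots.

    Assumption, not the thesis's confirmed method: each snapshot contributes
    min(Q1, Q2), and the k most recent snapshots are averaged.
    """

    def __init__(self, k: int = 4, gamma: float = 0.99) -> None:
        self.gamma = gamma
        # Oldest snapshots are dropped automatically once k pairs are stored.
        self.snapshots: Deque[Tuple[Critic, Critic]] = deque(maxlen=k)

    def push(self, critic_1: Critic, critic_2: Critic) -> None:
        """Store a frozen pair of critics from the latest update."""
        self.snapshots.append((critic_1, critic_2))

    def target(
        self,
        reward: float,
        next_state: Sequence[float],
        next_action: Sequence[float],
        done: bool,
    ) -> float:
        """Return r for terminal steps, else r + gamma * mean_k(min(Q1, Q2))."""
        if not self.snapshots:
            raise ValueError("no critic snapshots stored yet")
        if done:
            return reward
        # Pessimistic estimate per snapshot limits overestimation.
        pessimistic = [
            min(q1(next_state, next_action), q2(next_state, next_action))
            for q1, q2 in self.snapshots
        ]
        # Averaging across snapshots smooths the target between updates.
        return reward + self.gamma * sum(pessimistic) / len(pessimistic)


# Toy usage with constant critics standing in for neural networks.
demo = AveragedDoubleCriticTarget(k=4, gamma=0.5)
for q in (1.0, 2.0, 3.0, 4.0, 5.0):
    demo.push(lambda s, a, q=q: q, lambda s, a, q=q: q + 1.0)
demo_target = demo.target(1.0, [0.0], [0.0], done=False)
```

In a real training loop the stored callables would be copies of the target Critic networks, and the next action would come from the target Actor. Taking the minimum biases the estimate downward, while averaging over snapshots reduces its variance; both choices are illustrative here.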
Keywords/Search Tags:Autonomous driving, Deep reinforcement learning algorithm, Deep deterministic policy gradient algorithm, Average-DDPGf DI algorithm