Research On Vehicle Path Planning Based On Deep Reinforcement Learning

Posted on: 2022-03-19
Degree: Master
Type: Thesis
Country: China
Candidate: S M Gu
Full Text: PDF
GTID: 2492306341469124
Subject: Traffic and Transportation Engineering
Abstract/Summary:
Intelligent vehicle technology can help a vehicle reach its destination in less time and avoid traffic accidents while driving. Path planning is the core of this technology. Traditional path planning methods have several problems in the optimization process: slow convergence, inability to handle continuous tasks, and a tendency to fall into local optima. As a result, path planning for vehicles in traffic is inefficient. This thesis therefore studies vehicle path planning strategies based on deep reinforcement learning. The main research work is as follows:

(1) The traditional Q-learning algorithm easily falls into local optima and converges slowly in vehicle path planning. To address this, a dynamic exploration factor is introduced and an improved algorithm, ε-Q-learning, is proposed. The exploration factor ε changes dynamically. If an exploration episode from the start point to the end point fails, ε is increased so that the next episode is more random and avoids being trapped in the previous local optimum. Conversely, decreasing ε makes exploration more goal-directed, so the vehicle searches around the current best path more purposefully, which improves exploration efficiency and reduces the risk of falling into a local optimum. A simulation environment for vehicle path planning is built with Anaconda, and the performance of ε-Q-learning is compared with that of traditional Q-learning. Experimental results show that ε-Q-learning improves vehicle path planning efficiency by 12.5% compared with Q-learning, and that the path obtained with ε-Q-learning is also better than the path obtained with Q-learning.

(2) Tabular reinforcement learning cannot handle continuous tasks and tends to overestimate Q values in vehicle path planning. To address this, the TD3 (Twin Delayed Deep Deterministic Policy Gradient) algorithm is introduced. It handles tasks with continuous action spaces and continually suppresses overestimation of the Q value. Three indicators are used in the experiments: loss function, cumulative return, and optimal path. The simulation environment is built in Jupyter Notebook. Because the data used in this thesis is low-dimensional and relatively simple, a fully connected neural network is used instead of a convolutional neural network to process traffic data for Hangzhou (taken from the Hangzhou real-time traffic congestion index monitoring platform). Finally, TD3 is tested on path planning, and Q-learning, ε-Q-learning, and TD3 are compared on the same data set. The experimental results show that TD3 can plan vehicle paths and that, compared with Q-learning and ε-Q-learning, it improves the cumulative return and the loss function by about 51%.
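
The two ideas described in the abstract can be illustrated with a short sketch. It is not the thesis's implementation: the grid world, the reward values, the step size used to adjust ε, its bounds, and all function names (`adjust_epsilon`, `train_epsilon_q_learning`, `td3_target`) are assumptions chosen for illustration. The first part raises ε after a failed episode and lowers it after a successful one, as the abstract describes. The second part shows the clipped double-Q target that TD3 uses to curb Q-value overestimation, namely taking the smaller of the two critic estimates.

```python
import random


def adjust_epsilon(epsilon, reached_goal, step=0.05, eps_min=0.05, eps_max=0.9):
    """Dynamic exploration factor (step and bounds are assumed values).

    A failed episode raises epsilon so the next episode is more random;
    a successful episode lowers it so the search becomes more directed.
    """
    if reached_goal:
        return max(eps_min, epsilon - step)
    return min(eps_max, epsilon + step)


def train_epsilon_q_learning(size=5, episodes=300, alpha=0.5, gamma=0.9,
                             max_steps=50, seed=0):
    """Tabular Q-learning on a hypothetical size x size grid.

    The vehicle starts at (0, 0) and must reach (size-1, size-1).
    Each move costs -1 and reaching the goal pays +10 (assumed rewards).
    """
    rng = random.Random(seed)
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    goal = (size - 1, size - 1)
    q = {(r, c): [0.0] * 4 for r in range(size) for c in range(size)}
    epsilon = 0.5

    for _ in range(episodes):
        state = (0, 0)
        reached = False
        for _ in range(max_steps):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.randrange(4)
            else:
                action = max(range(4), key=lambda i: q[state][i])
            dr, dc = moves[action]
            # Moves that would leave the grid keep the vehicle at the border.
            nxt = (min(max(state[0] + dr, 0), size - 1),
                   min(max(state[1] + dc, 0), size - 1))
            reached = nxt == goal
            reward = 10.0 if reached else -1.0
            # Standard Q-learning update; no bootstrapping from the goal state.
            target = reward if reached else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if reached:
                break
        # Episode outcome drives the exploration factor for the next episode.
        epsilon = adjust_epsilon(epsilon, reached)
    return q, epsilon


def td3_target(reward, q1_next, q2_next, gamma=0.99, done=False):
    """Clipped double-Q target used by TD3.

    q1_next and q2_next are the two target critics' estimates for the
    next state-action pair; the smaller one is used to limit overestimation.
    """
    bootstrap = 0.0 if done else gamma * min(q1_next, q2_next)
    return reward + bootstrap


if __name__ == "__main__":
    q_table, final_epsilon = train_epsilon_q_learning()
    print("final epsilon:", final_epsilon)
    print("TD3 target:", td3_target(1.0, 5.0, 3.0, gamma=0.9))
```

A full TD3 agent also needs actor and critic networks, delayed policy updates, and target policy smoothing; only the target computation is shown here.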
Keywords/Search Tags: Deep reinforcement learning, Neural network, Path planning, Data processing