| Path planning is one of the key technologies for autonomous and intelligent unmanned vehicles. Because real-world environments are highly diverse, a path planning algorithm must be highly adaptable. This paper therefore applies Deep Reinforcement Learning (DRL), which adapts well to varying environments, to the path planning of an unmanned vehicle. First, path planning in a static environment is realized with the Deep Q-Network (DQN) algorithm; second, path planning in a dynamic environment is realized with the Asynchronous Advantage Actor-Critic (A3C) algorithm; finally, the two algorithm frameworks are combined and an experimental study in a complex continuous environment is carried out with the Deep Deterministic Policy Gradient (DDPG) algorithm. The main research contents are as follows:

(1) Static path planning based on the DQN algorithm. When the classic DQN algorithm is applied to unmanned-vehicle path planning, it suffers from poor exploration ability and excessively long training time. This work first simplifies the state space and designs the reward function to improve the training efficiency of the neural network and the exploration ability of the algorithm, and then carries out simulation experiments on grid maps of different sizes. The results show that the improved algorithm not only performs well on small maps, but also achieves higher training efficiency and robustness when the number of environment states is large.

(2) Dynamic path planning based on the A3C algorithm. To address dynamic obstacle avoidance by unmanned vehicles in a dynamic environment, this chapter builds on the Actor-Critic framework and uses the A3C algorithm to study path planning. To handle dynamic problems better, a neural network model is built by combining a Recurrent Neural Network (RNN) with fully connected layers, and the model is trained in a multi-threaded manner. Finally, a simulation experiment is carried out in a dynamic grid environment. The experimental results show that this method can effectively avoid obstacles and obtain a collision-free path.

(3) Path planning in a complex continuous space. In path planning experiments for unmanned vehicles in complex continuous environments, the DDPG algorithm converges slowly and trains inefficiently. This chapter improves the exploration efficiency of the algorithm model by designing the reward function and adjusting the exploration strategy. To stay close to real conditions, the simulation environment of this chapter is the TORCS simulator. The simulation results show that the model converges quickly and completes the path planning of the unmanned vehicle. Finally, a real-vehicle experiment is carried out to verify the robustness of the algorithm. |
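Both the DQN and DDPG chapters attribute their speed-up to a hand-designed reward function. The thesis does not spell that function out here, but a common choice for grid path planning is a distance-based shaped reward: a large terminal bonus at the goal, a large penalty on collision, and a dense per-step signal proportional to progress toward the goal. The sketch below is an illustration of that idea only; the function name, constants, and signature are assumptions, not the thesis's actual design.

```python
import math

def shaped_reward(pos, goal, obstacles, prev_dist):
    """Hypothetical distance-based shaped reward for grid path planning.

    pos, goal   -- (x, y) grid cells
    obstacles   -- set of blocked (x, y) cells
    prev_dist   -- Euclidean distance to the goal at the previous step

    Returns (reward, new_dist). Dense feedback: positive when the agent
    moves closer to the goal, negative when it moves away, with large
    terminal rewards for reaching the goal or colliding.
    """
    if pos == goal:
        return 100.0, 0.0          # terminal bonus: goal reached
    if pos in obstacles:
        return -100.0, prev_dist   # terminal penalty: collision
    dist = math.hypot(goal[0] - pos[0], goal[1] - pos[1])
    # progress term minus a small living cost that discourages wandering
    return (prev_dist - dist) - 0.1, dist
```

The dense progress term is what cuts training time relative to a sparse goal-only reward: the agent receives a learning signal at every step instead of only at episode end, so the Q-network (or DDPG critic) converges with far fewer episodes.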