| The autonomous underwater vehicle (AUV) is an important underwater exploration tool and a key development priority for countries around the world. Because the underwater environment is distinctive and complex, autonomy is central to an AUV's intelligence. Path planning plays an important role in achieving autonomous navigation and operation, and the quality of path planning determines, to a certain extent, the level of AUV autonomy. It is therefore of great practical significance to explore a path planning method suited to complex environments. Reinforcement learning is currently one of the most promising artificial intelligence methods: it optimizes system performance through continuous interaction with the environment and therefore adapts well to that environment. Based on these characteristics, this paper applies reinforcement learning to AUV path planning to improve the AUV's ability to adapt to the underwater environment. In actual operation, AUV motion can be divided into two phases: the diving phase and the horizontal motion phase during fixed-depth operation. The main research contents of the paper are as follows. First, the principles of reinforcement learning are analyzed in detail, including its model, elements, and algorithms, and the problems that require attention when applying reinforcement learning to AUV path planning are analyzed. Second, Q-learning is applied to local path planning during fixed-depth AUV operation. A simulated sensor is designed according to the characteristics of the AUV's forward-looking sonar, and an AUV training environment is built on it, providing an effective verification platform for the subsequent path planning design. To address the slow convergence of Q-learning, the 
eligibility trace technique is used to accelerate learning, and a path planning method based on the improved Q-learning is designed. The designed AUV path planning method is then verified in simulation. Third, to address the shortcomings of tabular Q-learning in continuous spaces, this paper applies deep Q-learning to AUV path planning. Based on an analysis of how neural networks are implemented, the network structure for deep Q-learning is designed, a prioritized replay buffer is used to improve the efficiency of the algorithm, and an AUV path planning method based on deep Q-learning is designed. The proposed method is then verified in simulation. Finally, an improved Rapidly-exploring Random Tree (RRT) algorithm is used for three-dimensional AUV path planning, so that the AUV can descend to the specified depth for operation. To address the shortcomings of the RRT algorithm, the improved algorithm introduces the self-learning ability of reinforcement learning into RRT to perform approximate nearest-neighbor search, and adds a goal-bias strategy to improve convergence speed. To remove redundant nodes from the path, a "reverse smoothing method" is proposed. The proposed method is verified by simulation. |
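
The goal-bias strategy and the removal of redundant path nodes mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify either procedure, so everything below is an assumption for illustration: the names `sample_with_goal_bias`, `prune_from_goal`, and `segment_is_free`, the `goal_bias` parameter, and the use of a generic line-of-sight pruning pass that runs from the goal back toward the start. It should not be read as the paper's "reverse smoothing method".

```python
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float, float]


def sample_with_goal_bias(
    goal: Point,
    bounds: Sequence[Tuple[float, float]],
    goal_bias: float,
    rng: random.Random,
) -> Point:
    """Return the goal with probability goal_bias, else a uniform random point.

    goal_bias is a hypothetical tuning parameter in [0, 1].
    """
    if rng.random() < goal_bias:
        return goal
    return tuple(rng.uniform(lo, hi) for lo, hi in bounds)


def prune_from_goal(
    path: List[Point],
    segment_is_free: Callable[[Point, Point], bool],
) -> List[Point]:
    """Drop redundant waypoints by walking from the goal back to the start.

    segment_is_free is a caller-supplied collision check (an assumption here);
    consecutive waypoints of the input path are assumed to be connectable.
    """
    if len(path) < 3:
        return list(path)
    pruned = [path[-1]]              # begin at the goal
    current = len(path) - 1
    while current > 0:
        nxt = current - 1            # fall back to the adjacent waypoint
        # Look for the earliest waypoint with a free straight segment.
        for i in range(current - 1):
            if segment_is_free(path[i], path[current]):
                nxt = i
                break
        pruned.append(path[nxt])
        current = nxt
    pruned.reverse()                 # restore start -> goal order
    return pruned
```

In an RRT loop, `sample_with_goal_bias` would replace the purely uniform sampling step, and `prune_from_goal` would be applied once to the path extracted from the tree. A larger `goal_bias` pulls the tree toward the target faster but explores less of the space, so the value is a trade-off rather than a fixed recommendation.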