| An unmanned underwater vehicle (UUV) is a highly intelligent unmanned platform that usually carries out special tasks in complex and unfamiliar waters. Many factors in such a mission environment threaten navigation safety: static factors such as large numbers of islands, reefs, and mines, and dynamic factors such as moving ships and random current interference. The mission environment can therefore be summarized as unknown waters with complex conditions. To ensure both mission success and the safety of the UUV, path planning capability is particularly important. This paper addresses path planning in complex waters using the value iteration network (VIN), and designs an end-to-end model that combines machine learning with reinforcement learning. Compared with traditional numerical methods, the planning time is greatly reduced, the dependence on an environmental model is weakened, and real-time performance is significantly improved. The main contents and contributions are as follows: 1. Information about the complex water area is processed and used to generate a virtual environment similar to the actual one, which serves as the interactive environment for the UUV. A suitable path planning method (the Voronoi algorithm) is selected to generate correct paths as sample labels. Based on the kinematic characteristics of the UUV, the action space output by the value iteration network is modified, the fully connected layer is optimized, and the reward mapping that feeds the value function is adjusted to match the UUV's action space. 2. A value iteration network with a dynamic convolution structure is designed. The convolution sampling and initialization methods are improved, and a convolution offset is introduced as a training parameter, to compensate for the loss of effective information that occurs when a large map makes the network depth and the number of iterations too large. 3. To address the low accuracy of the value iteration network in environments with dynamic currents, a convolution-weight value iteration algorithm that borrows the idea of decision trees is designed. The resulting network structure makes an estimate at a shallow position in the network and selects the appropriate convolutional layer to process the information, without significantly changing the amount of computation. 4. A network structure is constructed to realize the interaction between the UUV and the simulation environment. Targeted evaluation indicators for the path planning task (task success rate, trajectory difference, and path safety) are designed, the improved network is tested with the parameters obtained through training, and the effect of the improvements is analyzed using these indicators. This work not only improves the mission success rate but also considers the safety of the navigation route, guarantees a distance between the predicted route and obstacles, and includes a design that handles random current interference. The work has important theoretical significance for the autonomous mission capability of UUVs. |
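
As a rough illustration of the recurrence that a value iteration network unrolls, the following minimal sketch runs plain tabular value iteration on a small occupancy grid and then follows the greedy policy to the goal. It is not the network described above: the 5×5 grid, the reward values, the four-neighbour action set, and the names `neighbours`, `value_iteration`, and `greedy_path` are all assumptions made for illustration. In a VIN the per-action update is a learned convolution over the reward and value maps followed by a max over actions, whereas here it is a fixed lookup of neighbouring cells.

```python
# Assumed four-neighbour action set (row, col); not the UUV action space from the paper.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def neighbours(cell, obstacles):
    """Yield the free cells reachable from `cell` in one move."""
    rows, cols = len(obstacles), len(obstacles[0])
    for dr, dc in ACTIONS:
        nr, nc = cell[0] + dr, cell[1] + dc
        if 0 <= nr < rows and 0 <= nc < cols and not obstacles[nr][nc]:
            yield (nr, nc)


def value_iteration(obstacles, goal, gamma=0.95, iters=50,
                    step_cost=-1.0, goal_reward=10.0):
    """Tabular value iteration with deterministic moves and a terminal goal."""
    rows, cols = len(obstacles), len(obstacles[0])
    reward = [[step_cost] * cols for _ in range(rows)]
    reward[goal[0]][goal[1]] = goal_reward
    value = [[0.0] * cols for _ in range(rows)]
    for _ in range(iters):
        new_value = [[0.0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                if (r, c) == goal or obstacles[r][c]:
                    continue  # terminal or blocked cells keep value 0
                # Q(s, a) = reward of the next cell + discounted value of the next cell
                q = [reward[nr][nc] + gamma * value[nr][nc]
                     for nr, nc in neighbours((r, c), obstacles)]
                if q:
                    new_value[r][c] = max(q)  # V(s) = max over actions of Q(s, a)
        value = new_value
    return value, reward


def greedy_path(value, reward, obstacles, start, goal, gamma=0.95, max_steps=100):
    """Follow the action with the highest Q-value until the goal is reached."""
    path = [start]
    while path[-1] != goal and len(path) <= max_steps:
        options = list(neighbours(path[-1], obstacles))
        if not options:
            break  # dead end: no free neighbouring cell
        path.append(max(
            options,
            key=lambda n: reward[n[0]][n[1]] + gamma * value[n[0]][n[1]]))
    return path


# Hypothetical map: a vertical wall in column 2 that the path must go around.
obstacles = [[False] * 5 for _ in range(5)]
for wall_row in (1, 2, 3):
    obstacles[wall_row][2] = True

start, goal = (2, 0), (2, 4)
value, reward = value_iteration(obstacles, goal)
path = greedy_path(value, reward, obstacles, start, goal)
print(path)
```

The number of iterations must be at least the length of the longest shortest path on the map before the values stop changing, which hints at why large maps push a VIN toward greater depth and more iterations.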