Research On AUV Path Planning Based On Deep Reinforcement Learning

Posted on: 2021-04-28
Degree: Master
Type: Thesis
Country: China
Candidate: S W Zhang
GTID: 2492306047482284
Subject: Ships and marine structures, design of manufacturing
Abstract/Summary:
The sea is rich in resources and is a main channel for trade and cultural exchange. In this context, China needs to invest heavily in marine equipment to strengthen its marine development and management capabilities. The Autonomous Underwater Vehicle (AUV), an autonomous vehicle for exploring and developing underwater space, features small size, high speed, intelligence, good stealth, and zero casualties, and can carry out scientific investigation and engineering tasks well, so AUVs have a wide range of applications and are worth studying. Taking the AUV as its research object, this thesis investigates path planning strategies for autonomous navigation and examines obstacle avoidance in both static and dynamic environments. The specific content is as follows.

First, the background and significance of the work are introduced, and the current state of research on common path planning algorithms in China and abroad is analyzed. Reinforcement learning is applied to AUV path planning to cope with the complexity and variability of the marine environment and to improve the AUV's environmental adaptability and self-learning ability.

Second, AUV obstacle avoidance in a static environment is studied on the basis of the Actor-Critic algorithm. To address the algorithm's slow convergence, an adaptive learning-rate adjustment strategy is proposed: because suitable learning rates differ across dimensions, the learning rate at the current time step is updated dynamically from cumulative and current gradient information. Training on custom-built maps shows that the strategy effectively improves the convergence speed of the original algorithm, and that the optimized algorithm also converges more stably.

Third, an actor-multi-critic deep reinforcement learning algorithm is proposed to address the large oscillation amplitudes and low learning efficiency in the early stages of training that are common in traditional actor-critic algorithms. Multiple critics evaluate the AUV's actions from different aspects and then produce a combined evaluation through information fusion, which helps reduce the effect of coupling among multiple behaviors. Qualitative and quantitative comparison of the experimental results shows that the algorithm can learn online and improves learning efficiency, meeting the real-time and adaptability requirements of AUV dynamic obstacle avoidance.

Finally, the concept and significance of the semi-physical simulation platform are briefly introduced, along with the software and hardware structure of the path planning system that was developed. The semi-physical simulation test process and the results of the two algorithms in semi-physical simulation are described, demonstrating the reliability and real-time performance of the proposed AUV path planning method.
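
The abstract does not give the exact update rule for the adaptive learning rate or the fusion scheme used by the multiple critics. The sketch below is only an illustration under stated assumptions: it uses an RMSProp-style running average of squared gradients as a stand-in for "cumulative and current gradient information", and a weighted average as a stand-in for "information fusion". The names `adaptive_step` and `fuse_critics`, and the parameters `base_lr`, `decay`, and `eps`, are hypothetical and not taken from the thesis.

```python
import math


def adaptive_step(params, grads, accum, base_lr=0.1, decay=0.9, eps=1e-8):
    """One per-dimension adaptive update (RMSProp-style; an assumed rule).

    accum holds a running average of squared gradients for each dimension.
    Returns the updated (params, accum) as new lists.
    """
    new_params, new_accum = [], []
    for p, g, a in zip(params, grads, accum):
        # Blend the accumulated gradient history with the current gradient.
        a = decay * a + (1.0 - decay) * g * g
        # Dimensions with large recent gradients get a smaller step.
        lr = base_lr / (math.sqrt(a) + eps)
        new_params.append(p - lr * g)
        new_accum.append(a)
    return new_params, new_accum


def fuse_critics(values, weights):
    """Fuse several critic estimates into one score by weighted average
    (an assumed fusion scheme)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


if __name__ == "__main__":
    # Toy objective f(x, y) = x^2 + 10 * y^2: curvature differs per axis,
    # which is the situation a per-dimension learning rate is meant for.
    params, accum = [3.0, 3.0], [0.0, 0.0]
    for _ in range(200):
        grads = [2.0 * params[0], 20.0 * params[1]]
        params, accum = adaptive_step(params, grads, accum, base_lr=0.05)
    print("parameters after training:", params)

    # Three hypothetical critics scoring one action from different aspects.
    print("fused score:", fuse_critics([1.0, -0.5, 0.2], [0.5, 0.3, 0.2]))
```

In an actor-critic setting, an update like `adaptive_step` would be applied to the network weights, and the fused score would take the place of the single critic's estimate when updating the actor. How the thesis actually weights or combines its critics is not stated in the abstract.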
Keywords/Search Tags: AUV, unknown environment, local path planning, deep reinforcement learning