
Research on AUV Motion Planning Method Based on Maximum Entropy Deep Reinforcement Learning

Posted on: 2023-01-31  Degree: Master  Type: Thesis
Country: China  Candidate: X Yu  Full Text: PDF
GTID: 2530306905469404  Subject: Design and manufacture of ships and marine structures
Abstract/Summary:
This research explores how an autonomous underwater vehicle (AUV) can rely on global path information and local information obtained by sensors to make decisions efficiently and quickly in an unknown, complex environment, so as to avoid dense obstacles of various shapes, reach a specified target location, and complete the motion planning task while satisfying multiple constraints. To address the problems of poor exploration ability, single-mode strategies, high training cost, and sparse reward environments in the AUV motion planning task, an end-to-end motion planning system based on deep reinforcement learning is proposed. To solve these problems and improve AUV motion planning, the following work is carried out:

(1) Considering the constraints of system dynamics, sensor performance, obstacle collision ranges, and ocean current disturbance, the complex motion planning problem is formulated. Based on a neural network model, an end-to-end motion planning architecture mapping state information to action outputs is constructed, and a state space based on position, velocity, and obstacle information is defined. A simple sonar model is built to realize local obstacle avoidance, and the sonar dead-zone problem is studied. The AUV's action space is then determined, and the action values output by the neural network are clipped and linearly transformed.

(2) A motion planning system based on the soft actor-critic (SAC) algorithm is designed, in which the maximum entropy method increases the randomness of the policy and thereby enhances the AUV's ability to explore the environment. To address the problem of sparse environmental rewards, the motion planning task is decomposed and a comprehensive external reward function is designed, which guides the AUV toward the target point while constraining its navigation state and optimizing navigation distance and time.

(3) To overcome the difficulty and time cost of learning a policy from scratch in reinforcement learning, generative adversarial imitation learning (GAIL) is introduced to assist AUV training, with expert policies used to guide the AUV's learning. Furthermore, a combined SAC-GAIL algorithm is proposed, which is trained on a mixture of GAIL internal reward signals and external reward signals, reducing the cost of interaction between the AUV and the environment. By coordinating the weights of the internal and external rewards, the GAIL reward signal guides the AUV's navigation and encourages it to discover external environmental rewards.

(4) Based on visual simulation in Unity3D, a randomly distributed dense-obstacle environment is constructed, the episode termination criteria for the training process are defined, and appropriate reward values and algorithm parameters are selected. For single-target-point and multi-target-point tasks, motion planning systems based on the PPO, SAC, and SAC-GAIL algorithms are trained respectively, and the training results are analyzed. Using the trained policies, randomly generated target-point sequences are used to test and compare the algorithms. Good results are finally obtained, which verifies the effectiveness and stability of the proposed algorithm and demonstrates its advantages.
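The action post-processing described in (1) can be illustrated with a minimal sketch: the network's raw output is clipped to [-1, 1] and then linearly mapped to each actuator's physical range. The actuator ranges below (thrust and rudder angle) are illustrative assumptions, not values from the thesis.

```python
import numpy as np

def scale_action(raw_action, low, high):
    """Clip a raw network output to [-1, 1], then linearly map it to [low, high]."""
    a = np.clip(raw_action, -1.0, 1.0)
    return low + 0.5 * (a + 1.0) * (high - low)

# Hypothetical 2-D action: thrust in [0, 30] N, rudder angle in [-0.5, 0.5] rad.
low = np.array([0.0, -0.5])
high = np.array([30.0, 0.5])
action = scale_action(np.array([1.7, -0.2]), low, high)  # raw thrust 1.7 is clipped to 1.0
```

This keeps the policy network's output range fixed regardless of the vehicle's actuator limits, which is a common convention in continuous-control reinforcement learning.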
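A composite external reward of the kind described in (2) might combine a dense progress term with sparse terminal terms. The following sketch is an assumption about the general shape of such a function; the weights, penalties, and bonuses are placeholders, not the thesis's actual values.

```python
def external_reward(dist_prev, dist_now, collided, reached,
                    w_progress=1.0, step_penalty=0.01,
                    goal_bonus=10.0, collision_penalty=-10.0):
    """Illustrative composite reward: terminal bonus/penalty plus dense shaping."""
    if collided:
        return collision_penalty          # sparse penalty for hitting an obstacle
    if reached:
        return goal_bonus                 # sparse bonus for reaching the target
    # Dense shaping: reward progress toward the target, penalize elapsed time.
    return w_progress * (dist_prev - dist_now) - step_penalty
```

The dense progress term mitigates the sparse-reward problem the abstract mentions, since the agent receives a learning signal at every step rather than only at episode end.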
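The reward mixing in the SAC-GAIL combination of (3) reduces to a weighted sum of the GAIL discriminator's internal (imitation) reward and the environment's external reward. This sketch only shows that mixing step; the weight values are assumptions, and in practice the internal weight might be annealed as training progresses.

```python
def mixed_reward(r_gail, r_env, w_int=0.5, w_ext=0.5):
    """Weighted mixture of GAIL's internal reward and the external environment reward."""
    return w_int * r_gail + w_ext * r_env
```

Tuning the two weights trades off imitating the expert (which accelerates early learning) against discovering external environmental rewards, matching the coordination described in the abstract.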
Keywords/Search Tags: Autonomous Underwater Vehicle, Motion planning, Obstacle avoidance, Deep reinforcement learning, Soft Actor-Critic, Generative adversarial imitation learning