
Research on Multi-AUV Underwater Information Detection Decision Based on Reinforcement Learning

Posted on: 2023-02-05
Degree: Master
Type: Thesis
Country: China
Candidate: J J Lou
Full Text: PDF
GTID: 2532306905470184
Subject: Ships and marine structures, design of manufacturing
Abstract/Summary:
Unmanned combat is now widely studied and is gradually being adopted. Autonomous underwater vehicles (AUVs) can explore and develop underwater space and carry out many underwater tasks, so they have broad application prospects and high research value. This thesis uses a multi-AUV system to study intelligent decision planning for cooperative reconnaissance of unknown underwater targets. The specific contents are as follows.

First, the thesis introduces the background and significance of the research and reviews the current state of research, in China and abroad, on common path planning algorithms. Given the complexity of the cooperative reconnaissance task, a deep reinforcement learning algorithm is applied to multi-AUV decision-making, which improves the system's ability to detect unknown underwater targets and its adaptability to the environment.

Next, because of the AUV's kinematic constraints, the traditional rasterized (grid-based) modeling method cannot be applied to full-coverage path planning for AUVs. The thesis therefore proposes a modeling method suited to cooperative area reconnaissance by multiple AUVs, designs the corresponding continuous AUV state space, and uses a convolutional neural network to fit and output the AUV's actions. The network is then trained with a reinforcement learning algorithm, so that the AUV outputs continuous actions in the continuous state space and completes the full-coverage planning task under the proposed modeling method.

Then, building on the Soft Actor-Critic (SAC) deep reinforcement learning algorithm, the thesis improves the algorithm by using a truncated double Q-learning network to address the overestimation of Q values. The improved algorithm is applied to the AUV full-coverage path planning task, and simulation experiments indicate that it converges better.

During training of the multi-AUV system, each AUV faces a dynamic environment because the other AUVs continually change their strategies. To address this, a centralized training and decentralized execution framework is introduced into the SAC algorithm. To further reduce the mutual influence among AUVs caused by strategy changes, a minimax optimization method is added to the reinforcement learning objective. Simulation experiments verify the effectiveness of the algorithm.

Finally, to improve the reconnaissance ability and the generalization ability of the multi-AUV system, the thesis proposes a game training method. At regular intervals during training, a particle swarm optimization algorithm computes the target position against the current multi-AUV strategy, and the multi-AUV system then continues training with that optimal target position. Through this continued game between the two sides during training, the intelligence and generalization ability of the multi-AUV system are improved.
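
The Q-value overestimation fix mentioned in the abstract can be illustrated with a short sketch. This is not the thesis's implementation. It assumes that "truncated double Q learning" refers to the commonly used clipped double-Q target, in which the smaller of two target-critic estimates is used when forming the soft Bellman target. The function name, its arguments, and the default values of gamma and alpha are hypothetical and chosen only for illustration.

```python
def clipped_double_q_target(reward, done, q1_next, q2_next, log_prob_next,
                            gamma=0.99, alpha=0.2):
    """Soft Bellman target for one transition (illustrative sketch only).

    Assumptions, none of which come from the thesis:
      - q1_next, q2_next: two target critics' estimates of Q(s', a'),
        with a' sampled from the current policy
      - log_prob_next: log pi(a' | s') for that sampled action
      - alpha: entropy temperature, gamma: discount factor
      - done: 1.0 if s' is terminal, otherwise 0.0
    """
    # Taking the minimum of the two estimates is the "clipping" step:
    # it counteracts the upward bias of a single learned critic.
    min_q_next = min(q1_next, q2_next)
    # SAC's soft value subtracts the entropy term alpha * log pi(a' | s').
    soft_value_next = min_q_next - alpha * log_prob_next
    # No bootstrapping from terminal states.
    return reward + gamma * (1.0 - done) * soft_value_next


# Example: the two critics disagree (10.0 vs 8.0); the target uses 8.0.
target = clipped_double_q_target(reward=1.0, done=0.0,
                                 q1_next=10.0, q2_next=8.0,
                                 log_prob_next=-1.0,
                                 gamma=0.9, alpha=0.5)
print(target)
```

In the example, the target is built from the lower critic estimate (8.0) rather than the higher one (10.0), which is how this style of target limits overestimation. Both critics would then be regressed toward the same target value.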
Keywords/Search Tags:Autonomous underwater vehicle, Environment modeling, Deep reinforcement learning, Game training