
Research on Pipeline-Following Control of Autonomous Underwater Vehicles Based on Deep Reinforcement Learning

Posted on: 2021-05-30 | Degree: Master | Type: Thesis
Country: China | Candidate: Y N Liu | Full Text: PDF
GTID: 2370330611991178 | Subject: Computer Science and Technology
Abstract/Summary:
With the increasing population, the exploitation of marine resources and the laying of subsea pipelines for oil and natural gas are also increasing, and the protection of the marine environment is receiving more attention. To avoid subsequent damage to the marine biological environment, underwater pipelines need to be maintained regularly. Traditional manual inspection and maintenance of underwater engineering equipment poses safety hazards, and cabled underwater robots are limited by manual control, so there is an urgent need for pipeline inspection methods based on autonomous underwater robots. Researchers abroad have studied autonomous control methods, but most of these require a dynamic model, which is difficult to obtain in actual operation. Reinforcement learning has therefore attracted wide attention, but it has rarely been applied to the pipeline-following task. This thesis combines the underwater robot with deep reinforcement learning and studies a control strategy for the underwater robot based on deep reinforcement learning. The specific work is as follows.

First, a virtual pipeline experiment platform is built for this task. Because underwater robot hardware is very expensive and the marine environment is harsh, experimenting with a real robot is dangerous and costly. This thesis therefore integrates OpenAI Gym, the Robot Operating System (ROS), and UWSim into a 3D virtual simulation platform for subsea pipeline inspection. The platform avoids the dangers of running the scheme on real hardware, reduces the cost and duration of experiments, supplies the large number of samples that reinforcement learning requires, and improves the efficiency of sample collection. It also serves as the training platform for the pipeline-following task.

Second, an end-to-end following strategy based on the pixel-to-action mapping of deep reinforcement learning is proposed. It addresses image-based pipeline following for an autonomous underwater vehicle (AUV) that is fully actuated in the horizontal plane, a problem that most model-based methods cannot solve. The AUV's task is formulated as a Markov decision process (MDP) with continuous states, continuous actions, and unknown transition probabilities. The pipeline-following policy is modeled as a mapping from the image produced by the camera to the velocity of the AUV, and is represented by a deep neural network. The network is trained with the proximal policy optimization (PPO) method to obtain the pixel-to-action policy. Several experiments are then conducted to verify the effectiveness of the proposed method and the generalization ability of the learned policy. The simulation results show that the learned policy can keep the AUV on the pipeline and generalizes well to new, unseen pipeline geometries.

Third, a control strategy based on a modified convolutional neural network structure is proposed. This research relies mainly on images from the camera mounted on the bottom of the underwater robot. During pipeline following, these images may be disturbed by underwater lighting, depth, and other factors, which degrades the quality of the policy. Because the convolutional neural network does most of the image processing, the reward value obtained and the length of pipeline followed may be related to its structure. The network structure is therefore reconsidered, and unnecessary interference is removed by adding an image binarization step. In addition, because this work only learns the mapping from images to actions, experiments showed that a small-scale neural network structure trains better. With these improvements, the policy for the pipeline-following task is optimized.
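The binarization step mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `binarize`, the fixed threshold of 128, and the assumption that the camera frame is an 8-bit grayscale image in which the pipeline is brighter than the seabed are all hypothetical choices made for illustration.

```python
def binarize(gray_image, threshold=128):
    """Map each grayscale pixel (0-255) to 1 if it is >= threshold, else 0.

    `threshold` is a hypothetical fixed value; the abstract does not say
    how the threshold is chosen.
    """
    return [[1 if pixel >= threshold else 0 for pixel in row]
            for row in gray_image]


# A tiny hypothetical 3x4 frame: a bright pipeline stripe on a darker seabed.
frame = [
    [20, 200, 210, 30],
    [25, 190, 220, 35],
    [40, 180, 205, 50],
]

# The mask keeps only the pipeline shape and discards brightness variation.
mask = binarize(frame)
for row in mask:
    print(row)
```

Feeding such a mask to the policy network, instead of the raw frame, is one plausible way that lighting and depth effects could be suppressed before the image-to-velocity mapping is learned.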
Keywords/Search Tags: Pipeline following task, Reinforcement learning, Proximal Policy Optimization (PPO), Autonomous underwater vehicle (AUV)