| As ocean exploration deepens, scientific research and other underwater activities are increasing, which places higher demands on the underwater operation equipment and technology that support them. As one of the mainstream underwater operation tools, underwater hydraulic manipulators are usually mounted on underwater robots to carry out underwater operations. At present, the main control mode of the underwater manipulator is the master-slave servo type. Although this mode is mature and widely used, its performance depends largely on the operator's perception of the underwater environment and on-site operation. The whole master-slave servo process takes a long time and has low accuracy, so it is increasingly unable to meet the needs of efficient and complex tasks. How to apply autonomous operation technology to complex underwater tasks is therefore one of the key technical problems to be solved. Autonomous operation of an underwater manipulator involves many problems, such as modeling and motion control of the manipulator, underwater target recognition and positioning, and manipulator trajectory planning. This dissertation focuses on the motion control of the underwater hydraulic manipulator and combines deep reinforcement learning with manipulator motion control to achieve autonomous grasping of typical objects. The contents are as follows. Firstly, the kinematic model of the underwater hydraulic manipulator is established and the workspace of the manipulator is solved. To address the manipulator's poor positioning accuracy, joint calibration and kinematic calibration are carried out to ensure that the positioning accuracy meets the requirements of grasping. Secondly, a ROS-based motion control system for the underwater hydraulic manipulator is built, including the positioning of the target with the 
help of monocular vision and an ArUco marker on the host computer. Communication among the underwater manipulator, the deep reinforcement learning agent, and the simulation environment is also established. The PPO algorithm for continuous action spaces is selected, trained, and put into practice. A simulation training scene named "SIA7FARMPickANDPlace-v1" is built for the underwater manipulator on top of OpenAI Gym and MuJoCo, and a new reward function for target grasping tasks is adopted. The training results show that, with the new reward function, both the convergence speed and the average reward per episode are better than with the traditional reward function, and the success rate of grasping tasks in the simulation environment reaches about 90%. Finally, on-land and underwater experiments are designed and carried out. The errors of the manipulator's end-effector path and joint angles during the experiments are analyzed, and the results are compared with those of a traditional visual servo method, which verifies the feasibility and effectiveness of underwater object grasping based on deep reinforcement learning. |
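
For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate objective that PPO maximizes, together with the log-density of a diagonal Gaussian policy of the kind commonly used for continuous actions such as joint commands. It is not the dissertation's implementation: the function names, the clipping value of 0.2, and the sample numbers are assumptions chosen for illustration, and the abstract does not specify the network, the hyperparameters, or the form of the new reward function.

```python
import math


def gaussian_log_prob(action, mean, std):
    """Log-density of a diagonal Gaussian policy at a continuous action vector."""
    return sum(
        -0.5 * ((a - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2.0 * math.pi)
        for a, m, s in zip(action, mean, std)
    )


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    clip_eps=0.2 is an illustrative default, not a value taken from the source.
    """
    if not (len(new_log_probs) == len(old_log_probs) == len(advantages)):
        raise ValueError("all inputs must have the same length")
    if not advantages:
        raise ValueError("batch must not be empty")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps] to limit the policy update.
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Taking the minimum gives a pessimistic bound on the improvement.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Hypothetical two-joint action scored under an old and an updated policy.
    action = [0.10, -0.05]
    old_lp = gaussian_log_prob(action, mean=[0.0, 0.0], std=[0.2, 0.2])
    new_lp = gaussian_log_prob(action, mean=[0.05, -0.02], std=[0.2, 0.2])
    print(ppo_clipped_objective([new_lp], [old_lp], advantages=[1.0]))
```

The clipping is what keeps each policy update close to the policy that collected the data: once the probability ratio leaves the interval around 1, the objective stops rewarding further movement in that direction.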