| Manipulator grasping is a fundamental skill in robotics and is widely used in manufacturing, industrial, and service robots. Current grasping methods fall into three categories: traditional methods, deep learning methods, and deep reinforcement learning methods. Traditional methods can grasp only in simple, structured scenes with known object models, and perform poorly in unstructured scenes composed of unknown objects. Deep learning methods usually require a large amount of labeled data to train the network; their grasping performance depends mainly on the detection ability of the vision algorithm, and the robotic arm cannot learn independently or adapt to unknown scenes. This thesis therefore uses deep reinforcement learning to study target grasping by a manipulator in unstructured scenes. The main work of this thesis is as follows: (1) A target grasping method based on a deep Q-network is proposed. A Markov model of the manipulator's grasping process is established, the network structure and learning process of the target grasping algorithm are introduced, and the network's predictions are expressed as heat maps. To address insufficient learning of the grasping direction in the grasping network, this thesis proposes an unreasonable angle constraint policy. By dynamically evaluating the grasping effect in different directions, it constrains grasping actions in unreasonable directions to different degrees, which improves the learning efficiency and success rate of target grasping. The constraint factor in the unreasonable angle constraint policy is then introduced. This thesis proposes a grasping reward function with angle information to guide the
grasping network to learn a more accurate and fine-grained grasping policy. To address sparse rewards in target grasping training, this thesis proposes a grasping training method based on Hindsight Experience Replay, which effectively improves training efficiency. The feasibility and advantages of the proposed target grasping method are verified in the V-REP simulation environment. (2) To solve target grasping in complex scenes where multiple objects are densely placed, this thesis proposes a target grasping method based on coordinated pushing and grasping actions. The pushing policy is first used to expand the grasping space around the target, and the target is then grasped. This method effectively improves the success rate of target grasping in complex scenes. For the target grasping task studied in this thesis, the problems with current pushing methods are analyzed. This thesis proposes a pushing reward composed of three reward terms, which evaluates the effect of a push from three different perspectives and guides the pushing network to learn a more effective pushing policy. This thesis then proposes a coordination mechanism for pushing and grasping actions. The mechanism consists of measurement factors and a collision detection method; it accurately determines whether the current environment state meets the grasping conditions and provides a basis for action selection. To address low learning efficiency in complex scenes, this thesis proposes a push training method based on curriculum learning, which divides a complex pushing task into multiple sub-tasks and trains them in order from easy to difficult, thereby accelerating push training in complex scenes and improving its results. Finally, multiple sets of comparative and ablation experiments are carried out in the simulation environment to verify the superiority of the
target grasping method based on coordinated pushing and grasping. (3) To verify the grasping ability and practical applicability of the proposed method in a real environment, an experimental platform for manipulator grasping is built and hand-eye calibration of the manipulator is completed. A cooperative control system for manipulator pushing and grasping is then developed in Python with the PyTorch framework. The push and grasp models trained in simulation are transferred directly to the real environment. A series of target grasping experiments verifies the effectiveness and generalization ability of the target grasping method proposed in this thesis. |
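
The abstract names Hindsight Experience Replay as the remedy for sparse rewards but does not describe how it is applied. The following is a minimal sketch of the general idea only, not the thesis's implementation: a failed episode is stored a second time with its goal replaced by the outcome that was actually reached, so that at least one transition carries a non-zero reward. The `Transition` class, the grid-cell goal encoding, the 0/1 `sparse_reward`, and the "final" relabeling strategy are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical encoding: states and goals are grid cells (row, col).
Cell = Tuple[int, int]


@dataclass(frozen=True)
class Transition:
    state: Cell
    action: int
    next_state: Cell
    goal: Cell
    reward: float


def sparse_reward(achieved: Cell, goal: Cell) -> float:
    """Assumed sparse reward: 1 only when the goal is reached, else 0."""
    return 1.0 if achieved == goal else 0.0


def her_relabel_final(episode: List[Transition]) -> List[Transition]:
    """Relabel an episode with the 'final' strategy.

    The state reached at the end of the episode is treated as if it had
    been the goal all along, and every reward is recomputed against it.
    """
    if not episode:
        return []
    new_goal = episode[-1].next_state
    return [
        Transition(
            state=t.state,
            action=t.action,
            next_state=t.next_state,
            goal=new_goal,
            reward=sparse_reward(t.next_state, new_goal),
        )
        for t in episode
    ]


def build_replay_batch(episode: List[Transition]) -> List[Transition]:
    """Keep the original transitions and append the relabeled copies."""
    return list(episode) + her_relabel_final(episode)
```

In a grasping setting the relabeled goal might be, for example, the object that was actually picked up instead of the intended target; that mapping is an assumption here and would depend on how the thesis defines goals.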