| As the times advance and people pursue a better life, increasingly complex tasks continue to emerge. We therefore expect robots to solve tasks quickly in a variety of complex situations and, increasingly, in environments whose models are unknown. This requires intelligent robots to possess a higher level of self-learning intelligence: when faced with a complex environment, a new task, or an unknown model, the robot should be able to learn a skill independently to solve the problem. Robotic assembly is the main method of space assembly, and such tasks place ever-increasing demands on robot intelligence. In the past few years, the rapid growth of computing power has driven the rapid development of machine learning, which continues to improve the intelligence of computers. Introducing these methods into robotics allows robots to acquire a new form of intelligence as well. This paper focuses on peg-in-hole assembly as a basic form of assembly, and covers an assembly strategy based on impedance control, an assembly strategy based on deep reinforcement learning, simulation training, and the relevant comparative experiments. In the research on the assembly strategy based on impedance control, the forward kinematics, inverse kinematics, and differential kinematics of the manipulator are first derived, providing the basis for the controller design of the manipulator. Then, the jamming phenomenon in the assembly task is studied to analyze the main factors that hinder assembly, and the overall assembly process is designed according to the jamming phenomenon and the basic process of assembly. Finally, a large number of comparative experiments are carried out on the parameters involved, and relatively optimal parameter values are obtained. In the research on the assembly strategy based on deep reinforcement learning, the
main algorithm adopted is TD3, a reinforcement learning algorithm that improves on the DDPG algorithm. First, the fundamentals of deep reinforcement learning are studied, the basic formulas of the DDPG algorithm are derived, and the effectiveness of the TD3 algorithm's improvements over it is analyzed. Then, the TD3 algorithm is improved in several respects to increase the computational efficiency of training and accelerate the convergence of the network. The main improvements include adaptive annealing of exploration, pre-training of the network structure, and an improved replay buffer structure. Finally, a transfer learning algorithm is designed for the transfer from simulation experiments to real experiments. The last part covers simulation training and experiments. This paper mainly uses the V-REP simulation platform and Python to carry out the simulation training. First, a two-stage action director and a reward function are designed for the assembly task, and the effectiveness of the improved scheme is then demonstrated through multiple groups of comparative experiments. Next, the performance of the designed impedance control is verified in terms of contact force and compliance. The two strategies are then compared under various interference signals. Finally, a scheme and related software are designed to complete the corresponding verification experiments with a UR5 manipulator. |