With the progress of industrial robot technology and the rising requirements of industrial automation, robots are increasingly being used to perform diverse tasks. Over the past decade, automated robotic assembly has been a challenging research field, with high-performance peg-in-hole assembly a particularly active topic. This paper studies the compliant assembly technology of manipulators and proposes a strategy for learning assembly through imitation and reinforcement learning, achieving automated manipulator peg-in-hole assembly while ensuring compliance throughout the task.

First, this paper studies the generation of demonstration data for imitation learning. Expert demonstrations play a crucial role in imitation learning, and generating them is a difficult problem. This paper studies demonstration data generation in virtual and real environments respectively. In the virtual environment, demonstration data are obtained by training and evaluating a reinforcement learning policy. In the real environment, we model the demonstration data of the manipulator peg-in-hole assembly task, obtain demonstrations through expert kinesthetic (drag-and-teach) guidance, and propose a "high-frequency sampling, low-frequency recombination" method to process and expand the demonstration data (see the first sketch below).

Second, to address the low sample utilization of the traditional GAIL framework, this paper draws on the idea of hindsight experience replay and proposes the hindsight transformation generative adversarial imitation learning (HT-GAIL) algorithm. HT-GAIL converts part of the trajectories generated by the generator into expert-like data, which then also participates in training the discriminator, improving sample utilization while alleviating the shortage of expert demonstrations (see the second sketch below). We verify the algorithm in the Isaac Gym virtual simulation environment, showing that HT-GAIL accelerates training convergence and learns policies similar to the expert demonstrations, laying the foundation for the subsequent peg-in-hole assembly experiments on a physical platform.

Third, to further improve task performance and enable the policy to exceed the level of the expert demonstrations, this paper combines offline reinforcement learning and proposes an improved offline adversarial motion priors (AMP) algorithm. The algorithm deploys the generator policy network parameters trained by HT-GAIL as offline data in the environment and further optimizes the existing policy through the improved AMP algorithm, enabling the policy to complete higher-level tasks with a small amount of additional training time. We verify the algorithm in the Isaac Gym virtual environment, showing that the improved offline AMP algorithm optimizes the policy in a short time and that the new policy can even exceed the level of the expert demonstrations.
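One plausible reading of the "high-frequency sampling, low-frequency recombination" method is that each kinesthetic demonstration is recorded at a high control rate and then sliced into several phase-shifted low-rate trajectories. The following is a minimal sketch under that assumption; the 100 Hz recording rate, 10 Hz policy rate, and 7-dimensional state are hypothetical values chosen for illustration, not figures from the paper.

```python
import numpy as np

def recombine_demo(traj, stride):
    """Expand one high-frequency demonstration into `stride` low-frequency ones.

    `traj` is a (T, d) array of states (or state-action pairs) recorded at a
    high control rate; taking every `stride`-th sample with a different phase
    offset yields `stride` distinct trajectories at rate f_high / stride.
    """
    return [traj[offset::stride] for offset in range(stride)]

# Example: a 10 s kinesthetic demonstration recorded at 100 Hz, recombined
# into 10 trajectories at a hypothetical 10 Hz policy control rate.
demo = np.random.randn(1000, 7)            # placeholder for recorded joint states
augmented = recombine_demo(demo, stride=10)
print(len(augmented), augmented[0].shape)  # 10 trajectories of shape (100, 7)
```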
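As described above, the core of HT-GAIL is that transformed generator rollouts join the true demonstrations on the expert side of the discriminator. Below is a minimal PyTorch sketch assuming the transformation is HER-style goal relabeling (replacing the intended goal with the achieved one, so that a failed rollout reads as a successful, expert-like one); the goal-conditioned input layout, dimensions, network, and learning rate are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

def hindsight_relabel(states, goals, achieved):
    """Relabel a failed generator rollout with the goal it actually achieved
    (as in hindsight experience replay) so it reads as expert-like data."""
    new_goal = achieved[-1].expand_as(goals)   # achieved final state as the goal
    return torch.cat([states, new_goal], dim=-1)

def discriminator_step(D, opt, expert, relabeled, policy):
    """One GAIL discriminator update in which hindsight-relabeled generator
    data joins the true demonstrations on the expert (label 1) side."""
    bce = nn.BCEWithLogitsLoss()
    pos = torch.cat([expert, relabeled])
    loss = bce(D(pos), torch.ones(len(pos), 1)) + \
           bce(D(policy), torch.zeros(len(policy), 1))
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Minimal usage with dummy tensors (state dim 7, goal dim 3 -> input dim 10).
D = nn.Sequential(nn.Linear(10, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(D.parameters(), lr=3e-4)
expert  = torch.randn(32, 10)
policy  = torch.randn(32, 10)
relabel = hindsight_relabel(torch.randn(32, 7), torch.randn(32, 3),
                            torch.randn(32, 3))
print(discriminator_step(D, opt, expert, relabel, policy))
```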
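For the improved offline AMP stage, the text specifies only that the HT-GAIL policy parameters serve as offline data and that the policy is then optimized further. A common AMP formulation mixes a task reward with a style reward derived from a least-squares discriminator; the sketch below assumes that standard formulation (the clipped quadratic style reward from the original AMP paper by Peng et al., 2021), and the mixing weights and checkpoint name are hypothetical.

```python
import torch

def amp_style_reward(d):
    """Style reward from a least-squares AMP discriminator score d, using the
    clipped quadratic form max(0, 1 - 0.25 * (d - 1)^2) of the AMP paper."""
    return torch.clamp(1.0 - 0.25 * (d - 1.0) ** 2, min=0.0)

def mixed_reward(r_task, d, w_task=0.5, w_style=0.5):
    """Total reward: the task term lets the policy surpass the demonstrations,
    while the style term keeps it close to the prior (HT-GAIL) behavior."""
    return w_task * r_task + w_style * amp_style_reward(d)

# The fine-tuned policy would start from the HT-GAIL generator weights, e.g.
# policy.load_state_dict(torch.load("ht_gail_policy.pt"))  # hypothetical path
r = mixed_reward(r_task=torch.tensor([1.0]), d=torch.tensor([0.3]))
print(r)  # 0.5 * 1.0 + 0.5 * 0.8775 = 0.93875
```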
Finally, this paper builds a physical experiment platform for autonomous compliant peg-in-hole assembly by a manipulator, collects expert demonstration data for a peg-in-hole assembly task with 0.80 mm clearance, and trains a policy with the HT-GAIL algorithm. The experimental results show that the policy converges in about 11.5 hours, and the trained policy completes the 0.80 mm clearance task with an 87% success rate while meeting the compliance requirements. The experiment shows that HT-GAIL can learn the task policy from the demonstration data, and a comparison experiment with the GAIL algorithm confirms that HT-GAIL converges faster. To further improve the performance of the policy, we trained the improved offline AMP algorithm using the policy's network parameters as offline data. After about 5 hours of training, the success rate of the 0.80 mm clearance task increased from 87% to 95%, that of the 0.52 mm clearance task from 55% to 78%, and that of the 0.18 mm clearance task from 12% to 57%, all while meeting the compliance requirements. These experiments demonstrate that the improved offline AMP algorithm can significantly improve policy performance in a small amount of time, exceed the level of the expert demonstrations, and accomplish higher-precision assembly tasks.