
Research On Capturing Control Of Robot Arm Based On Deep Reinforcement Learning

Posted on: 2021-05-31
Degree: Master
Type: Thesis
Country: China
Candidate: W W Huang
Full Text: PDF
GTID: 2392330611998882
Subject: Mechanical engineering
Abstract/Summary:
On-orbit capture technology is an active research field among the world's space powers. China's space station is about to be built, and demand for space manipulators is high. In recent years, deep reinforcement learning (DRL) has developed rapidly; it can learn end-to-end control policies that map high-dimensional raw input directly to output without a mathematical model of the system. Against the background of intelligent robots and the capture of flying targets in space, this thesis studies the selection of neural network activation functions, the proximal policy optimization (PPO) algorithm and improvements to it, trajectory planning for capturing a moving target with a robot arm based on the proportional guidance method, and capture-task training with deep reinforcement learning for multi-degree-of-freedom robot arms in different scenes. The work is intended as a reference for highly intelligent capture of flying targets by China's space robots.

To address the vanishing gradient problem of activation functions in deep neural networks, the characteristics of activation functions and the criteria for selecting them are analyzed from the curves of different activation functions and their derivatives. This analysis serves as the basis for fitting the policy and value functions in deep reinforcement learning.

The deep reinforcement learning algorithm is the core of policy generation. This thesis studies the process and principles of deep reinforcement learning and derives the objective function of the proximal policy optimization algorithm from its two elements, the policy and the value function. To balance variance against bias, an improvement that combines the algorithm with generalized advantage estimation is proposed, and simulation experiments demonstrate the effectiveness of the improved algorithm.

For the traditional approach to capturing a moving target, a kinematic model of the robot arm is established. Following the guidance principle of the proportional guidance method, the planning law is derived in the two-dimensional plane and extended to the capture of moving targets in three dimensions. A trajectory planning method for capturing a moving target is designed, the influence of the guidance coefficient and the capture velocity on the capture trajectory, capture time, and related quantities is studied, and the time histories of the joint angles and joint angular velocities are analyzed. Simulation of a 6-DOF manipulator verifies that the planning algorithm based on proportional guidance is effective in capturing moving targets.

To apply the PPO algorithm to capturing a moving target with a robot arm, the DRL simulation environment and the controller of the DRL model were set up, and the DRL model of the capture task was established. The reward is designed as a piecewise function based on the episode termination condition and the rewards received during interaction. Because the urgency of the task differs from moment to moment, the rewards at different moments are weighted differently. 2-DOF and 6-DOF robot arms were used for the PPO training tasks, and the state and action spaces, the reward function, and the neural network structures of the policy and value functions were designed for each scene according to the task and the characteristics of the DRL environment. Simulation results show that the PPO algorithm is feasible for capturing moving targets with robot arms.

Because the moving target in the DRL training task moves with random speed along a random trajectory, dedicated comparison tasks are designed to compare the proportional guidance planning algorithm with the deep reinforcement learning algorithm. The characteristics, advantages, and disadvantages of the two methods are discussed in terms of capture trajectory, joint angles, and joint angular velocities.
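
The abstract names two ingredients, generalized advantage estimation (GAE) and the PPO objective, without giving formulas. The Python sketch below shows their standard textbook forms to make the variance/bias trade-off concrete; it is not taken from the thesis. The function names, the discount `gamma`, the GAE parameter `lam`, and the clipping range `eps` are illustrative assumptions, and the thesis's actual hyperparameters and implementation are not given in the abstract.

```python
def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Generalized advantage estimation over one trajectory segment.

    rewards[t] and values[t] cover steps t = 0..T-1; last_value is the
    value estimate for the state after the final step (0.0 if the
    episode terminated there). All names here are hypothetical.
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    # Work backwards so each step reuses the discounted sum of later steps.
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference residual.
        delta = rewards[t] + gamma * next_value - values[t]
        # lam = 0 keeps only delta (lower variance, more bias);
        # lam = 1 sums all residuals (less bias, higher variance).
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def ppo_clip_term(ratio, advantage, eps=0.2):
    """Per-sample clipped surrogate term of the standard PPO objective.

    ratio is pi_new(a|s) / pi_old(a|s); the term is to be maximized.
    """
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    adv = gae_advantages([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], 0.0,
                         gamma=1.0, lam=1.0)
    print(adv)
    print(ppo_clip_term(1.5, 1.0))
```

Setting `lam` between 0 and 1 interpolates between the one-step temporal-difference estimate and the full Monte Carlo return, which is the sense in which GAE balances variance against bias.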
Keywords/Search Tags:robot arm capture, deep reinforcement learning, the proportional guidance method, proximal policy optimization