Research On Capturing Control Of Robot Arm Based On Deep Reinforcement Learning

Posted on:2021-05-31

Degree:Master

Type:Thesis

Country:China

Candidate:W W Huang

Full Text:PDF

GTID:2392330611998882

Subject:Mechanical engineering

Abstract/Summary:

On-orbit capture technology is an active research field among the world's space powers. China's space station is about to be built, and space manipulators are in great demand. In recent years, deep reinforcement learning (DRL) has developed rapidly; it can learn end-to-end control policies that map high-dimensional raw inputs to outputs without a mathematical model. Against the background of intelligent robots and the capture of flying targets in space, this thesis studies the selection of neural network activation functions, the proximal policy optimization (PPO) algorithm and its improvement, trajectory planning for capturing a moving target based on the proportional guidance method, and DRL training of capture tasks for multi-degree-of-freedom robot arms in different scenes. It is intended to provide a reference for China's space robots to achieve highly intelligent capture of flying targets.

To address the vanishing gradient problem of activation functions in deep neural networks, the characteristics and selection of activation functions are analyzed from the curves of different activation functions and their derivatives. This analysis serves as the basis for fitting the policy and value functions in DRL.

The DRL algorithm is the core of policy generation. This thesis studies the process and principles of DRL and derives the objective function of the PPO algorithm from its two elements, the policy and the value function. To balance variance against bias, an improvement that combines the algorithm with generalized advantage estimation (GAE) is proposed, and simulation experiments demonstrate the effectiveness of the improved algorithm.

For the traditional approach to capturing a moving target, the kinematic model of the robot arm is established. Following the principle of the proportional guidance method, the planning formulation is derived for the two-dimensional plane and extended to the capture of moving targets in three dimensions. This thesis designs a trajectory planning method for capturing a moving target, studies the influence of the guidance coefficient and capture velocity on the capture trajectory, capture time, etc., and analyzes how the joint angles and joint angular velocities vary over time. Simulation of a 6-DOF manipulator verifies the effectiveness of the proportional-guidance-based planning algorithm in capturing moving targets.

To apply the PPO algorithm to capturing a moving target with a robot arm, the DRL simulation environment, the controller of the DRL model, and the DRL model of the capture task are established. The reward is designed as a piecewise function based on the episode termination conditions and the rewards given during interaction. Because task urgency differs from moment to moment, the reward weights at different moments are distinguished. 2-DOF and 6-DOF robot arms are used for the PPO training tasks, and the state and action spaces, the reward function, and the policy and value network structures are designed for different scenes according to the task and the characteristics of the DRL environment. Simulation results show that the PPO algorithm is feasible for capturing moving targets with robot arms.

Because the moving target in the DRL training tasks moves with random speed along a random trajectory, dedicated comparison tasks are designed to compare the proportional guidance planning algorithm with the DRL algorithm. The characteristics, advantages, and disadvantages of the two methods are discussed in terms of capture trajectory, joint angle, and angular velocity.
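
The abstract names two standard ingredients, generalized advantage estimation and the PPO objective, without giving formulas. The following is a minimal NumPy sketch of their textbook forms, not the thesis's implementation. The function names, the default values of `gamma`, `lam`, and `eps`, and the convention that `values` carries one extra bootstrap entry are all assumptions made for illustration.

```python
import numpy as np


def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimation over one trajectory.

    rewards: T rewards r_0 .. r_{T-1}
    values:  T + 1 value estimates V(s_0) .. V(s_T); the last entry is the
             bootstrap value (assumed convention, use 0.0 for a terminal state)
    lam trades bias for variance: lam=0 gives one-step TD errors (low
    variance, more bias), lam=1 gives Monte Carlo style returns minus the
    baseline (less bias, more variance).
    """
    rewards = np.asarray(rewards, dtype=float)
    values = np.asarray(values, dtype=float)
    horizon = len(rewards)
    advantages = np.zeros(horizon)
    running = 0.0
    # Backward recursion: A_t = delta_t + gamma * lam * A_{t+1}
    for t in reversed(range(horizon)):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def clipped_surrogate(ratio, advantages, eps=0.2):
    """PPO clipped surrogate objective (to be maximized).

    ratio: pi_new(a|s) / pi_old(a|s) for each sample
    """
    ratio = np.asarray(ratio, dtype=float)
    advantages = np.asarray(advantages, dtype=float)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    # The elementwise minimum removes any incentive to push the ratio
    # outside the interval [1 - eps, 1 + eps].
    return float(np.mean(np.minimum(unclipped, clipped)))
```

In a training loop, the advantages from `gae_advantages` would be passed to `clipped_surrogate` together with the probability ratios of the sampled actions; the value-function loss and any entropy term are omitted here.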
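
The proportional guidance principle mentioned in the abstract can be illustrated with a generic planar example: the pursuer's heading is turned at a rate proportional to the rotation rate of the line of sight to the target. The sketch below is a point-mass simulation only, assuming a constant-speed pursuer and a constant-velocity target. It does not reproduce the thesis's manipulator kinematics or its three-dimensional extension, and every name and default value in it (`simulate_proportional_guidance`, `nav_gain`, `capture_radius`, and so on) is hypothetical.

```python
import math


def simulate_proportional_guidance(pursuer, target, target_vel, speed,
                                   nav_gain=3.0, dt=0.01, t_max=20.0,
                                   capture_radius=0.02):
    """Planar proportional guidance with a constant-speed point pursuer.

    Returns (captured, elapsed_time, pursuer_path).
    nav_gain plays the role of the guidance coefficient.
    """
    px, py = pursuer
    tx, ty = target
    los = math.atan2(ty - py, tx - px)   # line-of-sight angle
    heading = los                        # assumption: start aimed at the target
    t = 0.0
    path = [(px, py)]
    while t < t_max:
        if math.hypot(tx - px, ty - py) <= capture_radius:
            return True, t, path
        # Advance pursuer and target by one time step.
        px += speed * math.cos(heading) * dt
        py += speed * math.sin(heading) * dt
        tx += target_vel[0] * dt
        ty += target_vel[1] * dt
        path.append((px, py))
        # Guidance law: heading change = nav_gain * line-of-sight change.
        new_los = math.atan2(ty - py, tx - px)
        d_los = math.atan2(math.sin(new_los - los), math.cos(new_los - los))
        heading += nav_gain * d_los
        los = new_los
        t += dt
    return False, t, path
```

Rerunning the simulation with different values of `nav_gain` and `speed` gives a simple way to see how the guidance coefficient and capture velocity shape the capture trajectory and capture time, the two influences the abstract says were studied.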

Keywords/Search Tags:

robot arm capture, deep reinforcement learning, the proportional guidance method, proximal policy optimization

Related items

1	Research On Decision-making Method Of Highway Autonomous Driving Based On Reinforcement Learning
2	Research On Urban Traffic Control Algorithm Based On Deep Reinforcement Learning
3	Research On Real-time Home Demand Response Strategy Based On Deep Reinforcement Learning
4	Research On End-to-end Deep Reinforcement Learning Control Of Intelligent Vehicle Based On PPO Algorithm
5	Dynamic Economic Dispatch Research Of Power System Considering Uncertainties Of Renewable Energy Forecast Error
6	Research On Capture Control Strategy Of Space Manipulator Based On Reinforcement Learning
7	Research On Driverless Control Policy Based On Deep Reinforcement Learning
8	Research On SDN Intelligent Routing Optimization Based On Deep Reinforcement Learning
9	Design On Terminal Guidance Law Based On Reinforcement Learning
10	Research On Virtual Unmanned Vehicle Control Based On Deep Reinforcement Learning