
Research On Autonomous Trajectory Planning Of Manipulator Based On Deep Reinforcement Learning

Posted on: 2023-09-16
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Chi
Full Text: PDF
GTID: 2568307127982929
Subject: Electrical engineering
Abstract/Summary:
With social and technological development, the demand for manipulators continues to grow. Trajectory planning is an important research direction for manipulators, but traditional trajectory planning algorithms depend heavily on the environment: the trajectory must be planned in advance from a known environment before the manipulator can move, so these algorithms cannot adapt to unknown environments. Thanks to the rapid development of power electronics, computing, and related technologies, deep reinforcement learning has become an important research direction in artificial intelligence and has made progress in manipulator trajectory planning. This thesis therefore studies end-to-end trajectory planning of a manipulator using deep reinforcement learning.

First, the D-H parameter method of the manipulator is analyzed and its kinematic equations are derived. The FetchReach robot simulation environment, defined in an XML data format, is then introduced. To run this environment, a simulation platform based on the MuJoCo physics engine is built; the deep reinforcement learning algorithms are implemented with the PyTorch framework, the third-party Gym module is introduced, and development is carried out in PyCharm.

Second, because traditional algorithms such as DQN (Deep Q-Learning) and SARSA (State-Action-Reward-State-Action) cannot operate in continuous action spaces, the DDPG (Deep Deterministic Policy Gradient) algorithm is used to realize autonomous trajectory planning of the manipulator. With the initial DDPG algorithm, the simplistic binary reward function and the low sampling efficiency result in low training efficiency. Therefore, a real-time partition
reward function is designed and a prioritized experience replay mechanism is adopted. The real-time partition reward function computes the reward value dynamically according to real-time and partition principles, solving the binary reward function's poor adaptability to the environment; in experiments, the manipulator reaches a 100% trajectory planning success rate after about 250 training epochs. The prioritized experience replay mechanism assigns each sample a sampling probability reflecting its training value, overcoming the low sampling efficiency of the uniform experience replay mechanism and further improving training efficiency; experiments show that about 190 training epochs achieve a 100% success rate.

Finally, the improved DDPG algorithm still suffers from overestimation bias, high variance, and sparse rewards, which lead to low sample utilization. To address these problems, the TD3 (Twin Delayed Deep Deterministic Policy Gradient) algorithm combined with the HER (Hindsight Experience Replay) algorithm is adopted. TD3 mainly mitigates the overestimation bias and high variance of DDPG, while HER applies the idea of learning from failure to alleviate the sparse reward problem and improve sample utilization. Experiments show that the manipulator reaches a 100% trajectory planning success rate after only about 50 training epochs, and the algorithm's performance improves by 23%. This lays a foundation for subsequent migration to a physical platform.
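The contrast between the binary reward and a real-time partition reward can be sketched as follows. The success tolerance, distance bands, and penalty values below are illustrative assumptions; the abstract does not give the exact partition used in the thesis.

```python
import math

def binary_reward(pos, goal, tol=0.05):
    """Sparse binary reward as in FetchReach-style tasks: 0 on success, -1 otherwise."""
    d = math.dist(pos, goal)
    return 0.0 if d <= tol else -1.0

def partition_reward(pos, goal, tol=0.05, near=0.15):
    """Illustrative 'real-time partition' reward: the workspace is split into
    distance bands, and the reward is recomputed from the current distance at
    every step, so progress toward the goal is rewarded even before success.
    Band boundaries and values are assumptions, not the thesis's exact design."""
    d = math.dist(pos, goal)
    if d <= tol:    # success band
        return 0.0
    if d <= near:   # near band: shaping penalty proportional to distance
        return -d
    return -1.0     # far band: flat penalty, as in the binary case
```

Unlike the binary reward, the partition reward distinguishes "close but not there yet" from "far away", which gives the agent a gradient to follow before the first success.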
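The prioritized sampling idea can be sketched with the standard proportional-priority formulation (priority exponent alpha, importance-sampling correction beta); the parameter values below are common defaults from the literature, not values stated in the abstract.

```python
import numpy as np

def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Prioritized experience replay: a sample's probability grows with its
    TD error, so informative transitions are replayed more often than under
    uniform sampling. alpha interpolates between uniform (0) and greedy (1)."""
    priorities = (np.abs(td_errors) + eps) ** alpha
    return priorities / priorities.sum()

def importance_weights(probs, beta=0.4):
    """Importance-sampling weights that correct the bias introduced by
    non-uniform sampling, normalized by the maximum weight for stability."""
    n = len(probs)
    w = (n * probs) ** (-beta)
    return w / w.max()
```

Transitions with larger TD errors receive larger sampling probabilities, which is how the mechanism "reflects the training value of the sample" relative to uniform replay.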
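The TD3 and HER mechanisms can be illustrated with two small sketches: the clipped double-Q target that counters DDPG's overestimation bias, and the goal relabeling that turns a failed transition into a useful one under a sparse reward. The transition layout and the reward convention (0 on success, -1 otherwise) are illustrative assumptions.

```python
def td3_target(r, q1_next, q2_next, gamma=0.99, done=False):
    """TD3's clipped double-Q target: two target critics evaluate the next
    action and the smaller estimate is used, counteracting the overestimation
    bias that a single critic (as in DDPG) tends to accumulate."""
    q_next = min(q1_next, q2_next)
    return r + (0.0 if done else gamma * q_next)

def her_relabel(transition, achieved_goal):
    """HER's 'learning from failure': re-store a failed transition with the
    goal replaced by the state actually reached, so that under the sparse
    reward it counts as a success and yields a learning signal.
    The (s, a, r, s_next, goal) layout is an illustrative convention."""
    s, a, _, s_next, _ = transition
    return (s, a, 0.0, s_next, achieved_goal)
```

Taking the minimum of two critics makes the target pessimistic rather than optimistic, and relabeling densifies the reward signal without changing the environment, which is why the combination improves sample utilization.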
Keywords/Search Tags:Deep Reinforcement Learning, Manipulator, Trajectory Planning, Reward Function