Deep Reinforcement Learning In Maximum Entropy Framework With Automatic Adjustment For Path Planning

Posted on:2024-06-12

Degree:Master

Type:Thesis

Country:China

Candidate:Y Y Chen

GTID:2568307076484384

Subject:Control Science and Engineering

Abstract/Summary:

Traditional path planning methods rely on a priori knowledge of the environment model, so they cannot be widely applied to unknown environments or complex tasks. In recent years, deep reinforcement learning has been applied to motion planning in high-dimensional robotic environments, with substantial breakthroughs in areas such as autonomous driving, mobile robot navigation, and robotic arm trajectory planning. To improve the generalization of path planning methods, this thesis applies deep reinforcement learning with adaptive maximum entropy adjustment to path planning tasks, enabling an agent to autonomously plan optimal paths for different tasks. The main contributions are summarized as follows.

First, the reward function largely determines the convergence rate in deep reinforcement learning. To avoid the problem of sparse rewards, a combined reward system suited to path planning problems is proposed. A goal-guided term, a penalty term, and an additional reward are considered, and these terms are combined in different proportions into a single reward system. For the comparison experiments, three types of experimental scenarios are designed, with progressively increasing complexity and difficulty. The experiments identify a generic combined reward system for path planning that effectively resolves the policy non-convergence caused by sparse rewards. Finally, the generality of the proposed combined reward system is verified in several experiments with complex scenarios.

Second, to address the difficulty of balancing exploration and exploitation in deep reinforcement learning, a deep reinforcement learning algorithm with adaptive maximum entropy adjustment is proposed. The method adjusts the temperature parameter automatically, so that the entropy can vary across states to control the degree of exploration, which reduces the likelihood of learning suboptimal policies. The method effectively improves the balance between exploration and exploitation, and its effectiveness and superiority are verified in many experiments.
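
The abstract does not give formulas for either contribution, so the following is only a minimal sketch under stated assumptions, not the thesis's implementation. `combined_reward` illustrates one plausible way to sum a goal-guided term, a penalty term, and an additional reward with weights; the term definitions and the weights `w_goal`, `w_penalty`, and `w_bonus` are hypothetical. `update_log_alpha` illustrates the temperature update commonly used in soft actor-critic style methods, where the temperature is adjusted so that the policy's average entropy tracks a target; whether the thesis uses exactly this objective is an assumption.

```python
import math


def combined_reward(dist_prev, dist_now, collided, reached,
                    w_goal=1.0, w_penalty=1.0, w_bonus=1.0):
    """Weighted sum of three reward terms (terms and weights are illustrative)."""
    # Goal-guided term: positive when the agent moved closer to the goal.
    goal_term = dist_prev - dist_now
    # Penalty term: fixed negative reward on collision (assumed form).
    penalty_term = -1.0 if collided else 0.0
    # Additional reward: fixed bonus when the goal is reached (assumed form).
    bonus_term = 1.0 if reached else 0.0
    return w_goal * goal_term + w_penalty * penalty_term + w_bonus * bonus_term


def update_log_alpha(log_alpha, log_probs, target_entropy, lr=1e-3):
    """One gradient-descent step on J(alpha) = E[-alpha * (log pi(a|s) + target_entropy)].

    The temperature is parameterized as alpha = exp(log_alpha) to keep it positive.
    log_probs holds log pi(a|s) for a batch of sampled actions.
    """
    alpha = math.exp(log_alpha)
    # Mean of (log pi + target entropy); positive when entropy is below target.
    mean_gap = sum(lp + target_entropy for lp in log_probs) / len(log_probs)
    # dJ/d(log_alpha) = -alpha * mean_gap
    grad = -alpha * mean_gap
    return log_alpha - lr * grad
```

With this update, the temperature rises when the sampled policy entropy falls below the target (encouraging exploration) and falls when the entropy exceeds it. Because the constraint is on the average entropy, individual states remain free to have higher or lower entropy.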

Keywords/Search Tags:

deep reinforcement learning, path planning, obstacle avoidance, reward function, maximum entropy

Related items

1	Motion Control Method Of Underwater Manipulator Based On Deep Reinforcement Learning
2	Research And Application Of Agents Obstacle Avoidance And Path Planning Based On Deep Reinforcement Learning
3	Research On Mobile Robots Obstacle Avoidance Planning Based On Reinforcement Learning Algorithm
4	Research On Obstacle Avoidance For AUV Based On Reinforcement Learning
5	Research On Path Planning Algorithm Based On Deep Reinforcement Learning
6	Research On Active SLAM Algorithm Based On Deep Reinforcement Learning In Complex Environment
7	Path Planning Of Patrol Robot Based On HPSO And Reinforcement Learning
8	Research And Implementation Of Obstacle Avoidance Method For Mobile Robot Based On Deep Reinforcement Learning
9	Design And Implementation Of Environment Based Obstacle Avoidance Path Planning System For Unmanned Vehicle
10	Research On Fast Obstacle Avoidance Path Planning Method Based On Q-learning Algorithm