Research And Implementation On Route Planning Algorithm Based On Reinforcement Learning Under Intelligent Travel Scene

Posted on:2021-12-29

Degree:Master

Type:Thesis

Country:China

Candidate:K Xu

Full Text:PDF

GTID:2518306050971349

Subject:Communication and Information System

Abstract/Summary:

In recent years, demand for mass travel has grown more diverse and complex, and mobile applications for tourism and travel have appeared in an endless stream. Travel route planning has therefore become an important practical problem. Depth-first and breadth-first search are two prevalent algorithms used to plan paths for users in current electronic maps. However, faced with complex and diverse travel routes and varied user requirements, these traditional algorithms generalize poorly, and their bottlenecks show in three respects. First, safety is insufficiently considered: current methods take the shortest path as the primary optimization goal and ignore path safety. Second, flexibility is lacking: current route planning methods do not support customized selection of multiple destinations and so cannot meet users' personalized journey planning demands. Third, data utilization is low: most existing methods consider only the locations of users and destinations and do not use the multi-source data published by government and transportation departments. As a result, it is difficult to plan a safe and comfortable route for users.

To tackle these problems, this thesis studies and implements an intelligent travel route planning algorithm based on reinforcement learning.

First, to meet users' travel safety needs, we propose a Q-learning safe route planning algorithm based on a policy guidance mechanism. We construct the reinforcement learning reward function from a safety index derived from road network data and government-published crime data, and on this basis model the safe route planning problem as a Markov decision process. Combined with a heuristic exploration method, in which the policy guidance mechanism is based on an artificial potential field function, the algorithm completes the single-target safe route planning task. Experimental results show that the proposed algorithm achieves state-of-the-art performance in balancing safety and distance, and that its convergence time is 31.52% lower than with a greedy exploration strategy.

Next, to serve users who need to pass through multiple destinations with the shortest total route length, we propose a multi-destination route planning algorithm based on the Actor-Critic (AC) algorithm from deep reinforcement learning. The algorithm builds a policy network and an evaluation network from a pointer network and long short-term memory (LSTM). It uses the AC framework to train the parameters of both networks, reducing the model's dependence on massive, high-quality labeled data. Pre-training with labeled data accelerates the convergence of the deep reinforcement learning algorithm, which then completes the task of planning the visiting order across multiple destinations. Experimental results show that, compared with the genetic algorithm and the distance matrix mapping method for multi-destination route planning, the proposed algorithm effectively shortens the total path length across the destinations.

The reinforcement-learning-based route planning algorithm researched and implemented in this thesis can be widely used in travel applications such as electronic maps to provide users with personalized route planning services.
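
The safe route planning idea in the abstract (a reward built from a safety index, plus exploration guided by a potential function) can be illustrated with a small tabular Q-learning sketch. This is not the thesis implementation. Everything specific here is an assumption made for illustration: the 3x3 grid, the `risk` values standing in for a crime-based safety index, the reward weights, the hyperparameters, and the use of negative Manhattan distance to the goal as a simple attractive potential.

```python
import random

MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def move(state, action, rows, cols):
    """Return the next cell; a move that would leave the grid keeps the agent in place."""
    r, c = state[0] + MOVES[action][0], state[1] + MOVES[action][1]
    return (r, c) if 0 <= r < rows and 0 <= c < cols else state


def train(risk, start, goal, episodes=2000, alpha=0.1, gamma=0.95,
          epsilon=0.2, risk_weight=5.0, seed=0):
    """Tabular Q-learning; all hyperparameter values are illustrative assumptions."""
    rng = random.Random(seed)
    rows, cols = len(risk), len(risk[0])
    q = {((r, c), a): 0.0
         for r in range(rows) for c in range(cols) for a in range(4)}

    def potential(s):
        # Assumed attractive potential: higher when closer to the goal.
        return -(abs(s[0] - goal[0]) + abs(s[1] - goal[1]))

    for _ in range(episodes):
        state = start
        for _ in range(4 * rows * cols):  # cap on episode length
            if rng.random() < epsilon:
                # Guided exploration: pick a move whose successor has the
                # highest potential, instead of a uniformly random move.
                scores = [potential(move(state, a, rows, cols)) for a in range(4)]
                action = rng.choice([a for a in range(4) if scores[a] == max(scores)])
            else:
                action = max(range(4), key=lambda a: q[(state, a)])
            nxt = move(state, action, rows, cols)
            # Reward: unit step cost plus a penalty scaled by the cell's risk.
            reward = -1.0 - risk_weight * risk[nxt[0]][nxt[1]]
            if nxt == goal:
                reward += 100.0
            future = 0.0 if nxt == goal else max(q[(nxt, a)] for a in range(4))
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = nxt
            if state == goal:
                break
    return q


def greedy_route(q, start, goal, rows, cols, max_steps=50):
    """Follow the highest-valued action from start until the goal or the step cap."""
    route = [start]
    while route[-1] != goal and len(route) <= max_steps:
        action = max(range(4), key=lambda a: q[(route[-1], a)])
        route.append(move(route[-1], action, rows, cols))
    return route


# Hypothetical risk map: only the centre cell is unsafe.
risk = [[0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0]]
q_table = train(risk, start=(0, 0), goal=(2, 2))
route = greedy_route(q_table, (0, 0), (2, 2), rows=3, cols=3)
print(route)
```

On this toy grid every shortest route has the same length, so the risk penalty alone should steer the learned route around the centre cell. Raising or lowering `risk_weight` is one simple way to trade safety against distance on larger maps where the safe route is a detour.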

Keywords/Search Tags:

Reinforcement Learning, Route Planning, Q-learning, Actor-Critic, Policy Guidance Mechanism

Related items

1	Research On Multiagent Cooperation And Applications Based On Reinforcement Learning
2	Exploratory Action Correction Algorithm Based On Actor-Critic
3	Robust Policy Gradient Algorithm Based On Actor-Critic In Deep Reinforcement Learning
4	Option Learning Method Research With Double Actor-Critic Architecture
5	Research On Policy-Constrained Reinforcement Learning
6	Research On Fast Policy Gradient Algorithms Of Reinforcement Learning Based On Adaptive Learning Rate
7	Researches On Improvement Of Fixed Temperature Soft Actor Critic Algorithm
8	Research On Robot Path Planning Based On Fusion Of Reinforcement Learning And Heuristic Search Algorithms
9	Research On Three Key Problems In Reinforcement Learning
10	Research On Deep Reinforcement Learning Methods For Autonomous Grasping Control Of Robots