
Research On Vehicle Routing Problem Algorithm Based On Deep Reinforcement Learning

Posted on: 2023-10-05
Degree: Master
Type: Thesis
Country: China
Candidate: J H Dong
Full Text: PDF
GTID: 2569306761459784
Subject: Software engineering
Abstract/Summary:
The logistics industry, as an important component of a country's economic strength, plays a significant role in guiding economic development and raising GDP. Against the backdrop of global economic integration, logistics has become one of China's major economic pillars. Modern logistics flourishes with the support of information technology, so it is particularly important to build a distribution system that matches the Internet industry using the available technical means. By controlling transport costs, the efficiency of modern logistics can be significantly enhanced and scientific distribution-system management achieved.

Combinatorial optimisation is a classical branch of computer science and operations research, with applications in a wide range of fields such as daily life, transportation, and manufacturing. The Vehicle Routing Problem (VRP), one of the classical combinatorial optimisation problems, aims to find a set of routes with the lowest transportation cost; such problems are generally hard to solve exactly in polynomial time. Unlike the Travelling Salesman Problem (TSP), the VRP carries additional conditions such as capacity constraints, and it gives rise to more complex variants depending on the scenario, which makes algorithms for solving it an attractive subject of study.

When solving the VRP, traditional metaheuristics can obtain locally optimal solutions in acceptable time, but they select the operator to execute stochastically, which wastes considerable time: they lack a strategy to guide exploration of the solution space efficiently. To address this challenge, this thesis proposes a deep reinforcement learning algorithm based on iterated local search (ILS) to solve the VRP, which we call Learning Action Sequence with a restart mechanism (LAS-restart). At each step of ILS, the model selects the current optimal action to execute and generates a new solution. Notably, the model performs the perturbation operator either actively or passively, depending on the benefit of the historical actions.

The model framework of the LAS algorithm consists of five components: a path simulator, an embedding network, an action policy network, an action pool, and a restart gate. The input to the model is a feature matrix of problem and solution features, which is transformed and fed into the embedding network. In our experiments we use a sequence encoder as the embedding network; it incorporates techniques such as multi-head attention to mitigate overfitting and vanishing gradients. The embedding network outputs a feature vector understood by the deep network, which is concatenated with the current state to form an observation of the environment. The observation is fed into an improved Deep Q-Network, which selects an optimal action to execute, with each operator in the action pool treated as a specific action. By repeating this process, the LAS algorithm eventually obtains an optimal solution for the instance. A restart mechanism is designed to explore the solution space more fully: it re-explores the instance when the search falls into a local optimum.

We ran numerical experiments on CVRP instances of different sizes and compared the results with improvement methods, construction methods, heuristic algorithms, and solvers. The results show that LAS-restart performs best among the compared state-of-the-art methods. We also performed ablation experiments, in which the LAS algorithm without the restart mechanism still achieves the best results on CVRP20 and CVRP100.
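The overall control flow described above (an ILS loop whose improvement operator is chosen from an action pool according to the historical benefit of each action, plus a restart gate that perturbs the incumbent when the search stalls) can be sketched in miniature. This is purely an illustrative toy, not the thesis's implementation: the toy CVRP instance, the operator names, and all parameters are invented, and the learned Deep Q-Network is replaced here by a simple running-average value per operator, which only mimics "selecting the current optimal action based on the benefit of historical actions".

```python
import math
import random

# Hypothetical toy CVRP instance: index 0 is the depot, 1..5 are customers.
COORDS = [(0, 0), (2, 3), (5, 1), (6, 4), (1, 6), (4, 7)]
DEMAND = [0, 3, 4, 2, 5, 3]
CAPACITY = 10

def route_cost(routes):
    """Total Euclidean length of all routes; each route starts/ends at the depot."""
    total = 0.0
    for r in routes:
        path = [0] + r + [0]
        for a, b in zip(path, path[1:]):
            (x1, y1), (x2, y2) = COORDS[a], COORDS[b]
            total += math.hypot(x1 - x2, y1 - y2)
    return total

def feasible(routes):
    """Capacity constraint: each route's total demand must fit one vehicle."""
    return all(sum(DEMAND[c] for c in r) <= CAPACITY for r in routes)

# --- action pool: each operator proposes a neighbouring solution ---
def op_swap(routes, rng):
    """Swap two customers (possibly across routes)."""
    new = [r[:] for r in routes]
    flat = [(i, j) for i, r in enumerate(new) for j in range(len(r))]
    (i1, j1), (i2, j2) = rng.sample(flat, 2)
    new[i1][j1], new[i2][j2] = new[i2][j2], new[i1][j1]
    return new

def op_relocate(routes, rng):
    """Move one customer to a random position in a random route."""
    new = [r[:] for r in routes]
    src = rng.choice([i for i, r in enumerate(new) if r])
    cust = new[src].pop(rng.randrange(len(new[src])))
    dst = rng.randrange(len(new))
    new[dst].insert(rng.randrange(len(new[dst]) + 1), cust)
    return new

ACTION_POOL = [op_swap, op_relocate]

def las_search(steps=300, eps=0.2, patience=30, seed=1):
    rng = random.Random(seed)
    cur = [[1, 2, 3], [4, 5]]            # a feasible starting solution
    best, best_cost = cur, route_cost(cur)
    gain = [0.0] * len(ACTION_POOL)      # running benefit per operator
    count = [1] * len(ACTION_POOL)       # (stand-in for the learned Q-values)
    stall = 0
    for _ in range(steps):
        # epsilon-greedy choice of the historically most beneficial operator
        if rng.random() < eps:
            a = rng.randrange(len(ACTION_POOL))
        else:
            a = max(range(len(ACTION_POOL)), key=lambda k: gain[k] / count[k])
        cand = ACTION_POOL[a](cur, rng)
        if not feasible(cand):
            continue
        delta = route_cost(cur) - route_cost(cand)
        gain[a] += delta
        count[a] += 1
        if delta > 0:                    # accept improving moves
            cur = cand
        if route_cost(cur) < best_cost:
            best, best_cost = cur, route_cost(cur)
            stall = 0
        else:
            stall += 1
        if stall >= patience:            # restart gate: perturb from the incumbent
            cur = op_relocate(op_relocate(best, rng), rng)
            stall = 0
    return best, best_cost

routes, cost = las_search()
```

In the actual LAS model the bandit-style value table above is replaced by an embedding network plus Deep Q-Network that maps problem and solution features to action values, but the skeleton of the loop is the same: score the operators, apply the best one, and let the restart gate trigger perturbation when no improvement has been seen for a while.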
Keywords/Search Tags:Deep Reinforcement Learning, Route Planning, Iterated Local Search, Multi-head Attention Mechanisms