
Vehicle Routing Problem Study Based On Reinforcement Learning And Attention Mechanism

Posted on: 2024-06-27
Degree: Master
Type: Thesis
Country: China
Candidate: L C Wang
GTID: 2542307097962809
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, China’s transport industry has grown rapidly, leading to a surge in the number of vehicles and worsening traffic congestion. Improving vehicle route optimization depends chiefly on accounting for real-time traffic conditions and on effective communication and collaboration among vehicles. Existing research focuses predominantly on finding optimal solutions for single- or dual-objective scenarios and does not account for how the optimal route may change in the future. This thesis therefore adopts an agent model-assisted optimization algorithm to study vehicle path planning, with the goal of more efficient and intelligent traffic management and control. To this end, it uses historical trajectory data to derive more reasonable and feasible routes through a data-driven optimization approach. The main work of this thesis is as follows:

1. The thesis proposes an enhanced data-driven, agent-assisted Actor-Critic algorithm for the vehicle routing problem with time windows. A baseline is introduced into the Actor-Critic algorithm, which reduces the variance of the gradient estimate and makes training smoother. The optimization is further aided by an agent model based on Random Forest, which evaluates paths using trajectory data collected in real time; the evaluation results guide the search process of the Actor-Critic algorithm. The experiments use a vehicle routing problem of size 10 and analyze the average reward, the loss function, and the paths generated by the network. The impact of different learning rates on the model is also examined. The comparative experiments involve the Actor-Critic algorithm without the agent model and Particle Swarm Optimization (PSO), a heuristic widely used for combinatorial optimization problems. The test data show that the proposed algorithm achieves lower path cost than the traditional heuristic algorithm and exhibits better generalization capability and stability.

2. The thesis introduces a sequence-to-sequence model with an attention mechanism for the vehicle routing problem with capacity constraints. The model comprises an initial encoding module, an encoder, and a decoder. The initial encoding module serializes the problem into an input vector. The encoder relies mainly on a multi-head attention layer to capture crucial information, while the decoder uses single-head attention with a masking mechanism to filter out nodes that violate the capacity constraints and to compute the probability distribution over nodes. The model is trained with a policy gradient algorithm, and the final sequence of nodes is determined through cost function calculation and cumulative reward evaluation. Mini-batch gradient descent is used to reduce training resource consumption. In the experiments, the proposed model is tested on instances with 10 and 20 nodes, and the paths generated under different decoding strategies are analyzed. The analysis focuses on average path cost, the variation of the model loss, and specific path records. The proposed model is compared with Deep Q-Network (DQN) and Genetic Algorithm (GA) in terms of testing time and path cost, and it yields lower path costs in the testing phase. The model is also applied to instances with 50 and 100 nodes, and the results are presented visually. The experimental findings indicate that the proposed attention-based sequence-to-sequence model achieves high accuracy and generalizes well when solving vehicle path planning problems with capacity constraints.

Overall, the thesis presents a reinforcement learning-based approach to the vehicle routing problem with time windows and capacity constraints that produces routing results in a timely and efficient manner.
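The decoder's masking step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `masked_node_probabilities`, the convention that node 0 is the depot and is always selectable, and the input scores are all assumptions made for illustration. The sketch sets the attention score of every infeasible node (already visited, or with demand above the remaining capacity) to negative infinity, so that the softmax assigns it probability zero.

```python
import math


def masked_node_probabilities(scores, demands, remaining_capacity, visited):
    """Softmax over attention scores with infeasible nodes masked out.

    Assumption (not from the thesis): node 0 is the depot and is always
    feasible, so at least one score stays finite.
    """
    masked = []
    for i, score in enumerate(scores):
        infeasible = i != 0 and (visited[i] or demands[i] > remaining_capacity)
        # A score of -inf becomes probability 0 after the softmax.
        masked.append(-math.inf if infeasible else score)

    peak = max(masked)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical example: node 2 exceeds the remaining capacity, node 3 is visited.
probs = masked_node_probabilities(
    scores=[0.0, 1.0, 2.0, 3.0],
    demands=[0, 4, 9, 2],
    remaining_capacity=5,
    visited=[False, False, False, True],
)
```

In this example only the depot and node 1 keep non-zero probability. A decoding strategy (for instance greedy selection or sampling) would then pick the next node from `probs`.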
Keywords/Search Tags: Vehicle Routing Problem (VRP), Attention Mechanism, Seq2seq model, Data-driven, Reinforcement Learning (RL)