Research On Multi-Agent Reinforcement Learning Vehicle Scheduling Algorithm Based On Harmonic Value Network

Posted on:2024-08-18

Degree:Master

Type:Thesis

Country:China

Candidate:K M Yang

GTID:2542307133991909

Subject:Computer technology

Abstract/Summary:

With the spread of smartphones, online ride-hailing platforms have entered the public view and quickly become a popular choice for travel. The number of online ride-hailing users in China has increased year by year, reaching 400 million by 2022. An efficient vehicle scheduling strategy therefore not only plays an important role in increasing the income of the platform and drivers, but also greatly alleviates traffic congestion, improves the public travel experience, and increases passenger comfort and satisfaction. For example, dispatching vehicles in advance to popular, crowded areas identified from historical data can greatly improve the order response rate, reducing both long passenger waits in crowded areas and idle drivers in unpopular areas with no orders to pick up.

Vehicle scheduling research mainly builds multi-agent reinforcement learning models from historical data to dispatch vehicles so that more orders are served, improving the benefits of the platform and drivers. Two main challenges stand in the way of improving the order response rate and income. First, each driver is treated as an independent individual who greedily selects the nearest order, so the overall optimal combination of assignments in the region is missed. Second, although the historical data is large, it cannot cover every situation, so offline training easily produces out-of-distribution data (state-action pairs that do not appear in the historical data). The value function often cannot estimate the value of such out-of-distribution data accurately, so the target value deviates from the actual value.

To address these two challenges, this thesis proposes Shared Attention Reinforcement Learning (SARL), based on shared attention, and the Uncertain Weighting Harmonic Twin-Critic Network (UWTC), based on uncertainty weighting. SARL builds on multi-agent reinforcement learning and adds shared attention built from multiple attention heads. It lets the agents attend to each other's positions through a shared input vector and seeks the global optimal solution in vehicle scheduling rather than the greedy suboptimal one. UWTC adds an uncertainty weighting module and a harmonic twin-critic network module to the Actor-Critic algorithm in order to estimate the value function more accurately and thereby select better policies.

The innovations of this thesis are as follows: (1) two vehicle scheduling algorithms based on multi-agent reinforcement learning, SARL and UWTC, are proposed; (2) a shared attention module based on the multi-head attention mechanism is proposed for vehicle scheduling, allowing vehicles to attend to and cooperate with each other to reach the optimal combination solution within the grid; (3) a multi-agent reinforcement learning algorithm based on an uncertainty weighting module and a harmonic twin-critic module is proposed for large-scale vehicle scheduling across different regions.

In addition, both proposed multi-agent reinforcement learning models were tested experimentally on real scenarios and real datasets. The results show that both SARL and UWTC improve the order response rate and total service value compared with existing mainstream models.
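
The first challenge named in the abstract, per-driver greedy selection of the nearest order missing the best combination for the region, can be illustrated with a toy example. The sketch below is not the thesis's SARL algorithm. It only compares greedy nearest-order selection with an exhaustive search over all one-to-one driver-order matchings. The distance matrix and all function names are hypothetical and chosen for illustration.

```python
from itertools import permutations

# Hypothetical pickup distances in km (assumed values, not from the thesis).
# Rows are drivers, columns are orders.
DIST = [
    [1.0, 2.0],  # driver 0
    [1.5, 9.0],  # driver 1
]


def total_distance(dist, assignment):
    """Sum of pickup distances when driver i serves order assignment[i]."""
    return sum(dist[i][j] for i, j in enumerate(assignment))


def greedy_assignment(dist):
    """Each driver, in index order, takes the nearest order not yet claimed."""
    taken = set()
    assignment = []
    for row in dist:
        free = [j for j in range(len(row)) if j not in taken]
        order = min(free, key=lambda j: row[j])
        taken.add(order)
        assignment.append(order)
    return assignment


def joint_assignment(dist):
    """Brute-force search over all one-to-one matchings (small inputs only)."""
    n = len(dist)
    best = min(permutations(range(n)), key=lambda p: total_distance(dist, p))
    return list(best)


greedy = greedy_assignment(DIST)
joint = joint_assignment(DIST)
print(greedy, total_distance(DIST, greedy))  # [0, 1] 10.0
print(joint, total_distance(DIST, joint))    # [1, 0] 3.5
```

Driver 0 greedily takes order 0, which leaves driver 1 with a 9 km pickup. The joint matching swaps the two orders and cuts the total pickup distance from 10.0 km to 3.5 km. The brute-force search is factorial in the number of drivers, so it serves only to make the gap visible on a small instance.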

Keywords/Search Tags:

multi-agent reinforcement learning, vehicle scheduling, attention mechanism, uncertainty analysis, value network optimization

Related items

1	Research And Implementation Of Collaboration Strategy Generation Technology Based On Multi-Agent Deep Reinforcement Learning
2	Research And Implementation Of Regional Road Traffic Collaboration Based On Multi-agent
3	Research On Distributed Intelligence-based UAV Edge Resource Scheduling Mechanism
4	Dynamic Resource Allocation Of Aerial-based Relay Based On Deep Reinforcement Learning
5	Research On Three-dimensional Trajectory Design And Resource Scheduling Optimization Algorithm For Complex UAV Network
6	Research On Multi-Agent Deep Reinforcement Learning Algorithm With Collision Times Constraints
7	Research On Network-Wide Traffic Signal Control Model Based On Reinforcement Learning
8	Research On Multi-Agent Architecture And Dynamic Scheduling Mechanism Of CPPS
9	A Learning-Guided Optimization Algorithm For Large-Scale Hybrid Flow Shop Problem
10	Research On Vehicle Edge Computing Task Scheduling Based On Multi-agent Reinforcement Learning