
Optimization And Transfer Of Deep Reinforcement Learning-Based Variable Speed Limit Control Strategy

Posted on: 2021-06-30
Degree: Master
Type: Thesis
Country: China
Candidate: Z M Ke
Full Text: PDF
GTID: 2492306476957079
Subject: Transportation planning and management

Abstract/Summary:
With population growth and economic development, the demand for vehicle travel has increased rapidly, and traffic congestion on expressways has become increasingly serious. The key to improving the efficiency of expressways is to alleviate congestion at merging bottlenecks. Variable speed limit control can dynamically adjust the posted speed limit to regulate the flow arriving at a downstream merging bottleneck, thereby improving the bottleneck's capacity.

Existing variable speed limit control strategies can be divided into optimization-based control and feedback control. Optimization-based control depends heavily on traffic flow models, and its control effects are determined by model accuracy, so its main difficulty lies in developing accurate traffic flow models. Feedback control depends little on traffic flow models, but it responds slowly and can rarely prevent congestion from forming in the first place. In addition, the transferability of both approaches needs improvement: when applied to a new scenario, optimization-based control requires recalibrating the traffic flow model, and feedback control requires manually retuning the controller parameters. Existing variable speed limit control strategies therefore still leave room for improvement in model dependence, control effects, and transferability.

With the development of artificial intelligence, deep reinforcement learning algorithms can automatically adapt to multiple environments and achieve optimal control effects without relying on a specific traffic model. This thesis therefore develops a variable speed limit control strategy based on deep reinforcement learning and applies artificial intelligence techniques such as transfer learning to improve the strategy's transferability.

First, the basic cell transmission model is extended, and a simulation platform for variable speed limit control of freeways is developed. The extended cell transmission model reproduces the capacity drop at the merging bottleneck and the stop-and-go waves that arise under congestion; based on the effects of variable speed limit control on traffic flow, the control is integrated into the model, with drivers' compliance with the posted limits reflected; and the calibration and validation procedures of the model are described. Following a modular design, the simulation model and the control algorithm are separated into modules, and cross-language interfaces are defined for real-time traffic data output and control action input. The simulation procedure is designed to support real-time interaction between the simulation model and various control algorithms. Finally, a real freeway segment is taken as the study site to construct a traffic flow simulation case.
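To make the simulation component concrete, the following minimal sketch outlines one way a cell transmission model step could honor a per-cell speed limit and a capacity drop at the merge. All parameter values, the capacity-drop formulation, and the function interface are illustrative assumptions, not the thesis's calibrated model.

```python
import numpy as np

V_FREE = 100.0      # free-flow speed (km/h); assumed value
W = 20.0            # congestion wave speed (km/h); assumed value
RHO_JAM = 150.0     # jam density (veh/km); assumed value
Q_MAX = 2000.0      # nominal capacity (veh/h); assumed value
ALPHA_DROP = 0.15   # assumed capacity-drop ratio once the bottleneck queues
DT = 10 / 3600.0    # time step (h)
DX = 0.5            # cell length (km)

def ctm_step(rho, vsl, bottleneck, critical_density=33.0):
    """Advance cell densities rho (veh/km) by one time step.

    vsl: per-cell speed limit (km/h); bottleneck: index of the merge cell.
    """
    # Sending (demand) flow of each cell, capped by the local speed limit.
    demand = np.minimum(rho * np.minimum(V_FREE, vsl), Q_MAX)
    # Receiving (supply) flow of each cell.
    supply = np.minimum(W * (RHO_JAM - rho), Q_MAX)
    # Capacity drop: once the bottleneck cell is congested, its
    # discharge capacity falls by ALPHA_DROP.
    if rho[bottleneck] > critical_density:
        supply[bottleneck] = min(supply[bottleneck], (1 - ALPHA_DROP) * Q_MAX)
    # Flow across each interface between consecutive cells.
    flow = np.minimum(demand[:-1], supply[1:])
    # Conservation update for interior cells (boundary cells are
    # treated as fixed boundary conditions here).
    rho[1:-1] += DT / DX * (flow[:-1] - flow[1:])
    return rho
```

In such a setup, a variable speed limit controller would lower `vsl` on the cells upstream of the bottleneck each control cycle and read the resulting densities back through the data output interface as its observation.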
Second, based on the Double Deep Q-Network (DDQN) algorithm, a variable speed limit control strategy for improving traffic efficiency is proposed. The algorithm input (i.e., the state) is defined as the real-time data collected by roadside detectors together with the speed limit of the previous control cycle, and the output (i.e., the action) as the speed limit to be posted in the variable speed limit control area. The objective function (i.e., the reward function) is designed to improve traffic efficiency: its main variable is the traffic density at the bottleneck, and the reward is highest when the bottleneck density equals the critical density, at which point the bottleneck operates in its optimal state and traffic efficiency is maximized. A feedback-based and a Q-learning-based variable speed limit control strategy are then introduced as baselines, along with the training process of the DDQN-based strategy; after training, the agent takes a sequence of actions that maximizes the cumulative reward. In the tests, a stable-demand scenario and a fluctuating-demand scenario are designed, and all three control strategies are applied in both. The results show that the DDQN algorithm outperforms the feedback control and Q-learning algorithms in both demand scenarios: the DDQN algorithm reduces the total time spent by 50.47% and 35.53%, the feedback control algorithm by 33.98% and 12.88%, and the Q-learning algorithm by 40.11% and 28.82%. The results also indicate that the DDQN algorithm raises the capacity of the bottleneck area and shrinks the spatio-temporal extent of the congested state, which is why it minimizes the total time spent.

Finally, to improve the transferability of the variable speed limit control strategy, this study integrates transfer learning into the DDQN-based strategy and transfers the strategy across three types of scenarios. A transfer learning-based strategy transfer algorithm is first proposed, and overspeed scenarios, inclement weather scenarios, and scenarios with different capacity drops are designed. In the overspeed scenarios, transfer learning shortens the training process by 32.3% to 56.7% while still achieving the optimal control effects. In the inclement weather scenarios, when the differences between the source scenario and the target scenarios are small, transfer learning shortens the training process by 58.0% and 52.1% and achieves the optimal control effects; when the differences grow, transfer learning converges to a local optimum and fails to reach the optimal control effects. In the scenarios with different capacity drops, transfer learning shortens the training process by 50.6% and 44.0% and achieves the optimal control effects. Overall, the smaller the differences between the source and target scenarios, i.e., the stronger the correlation between them, the more effective the transfer learning.
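The sketch below illustrates, under stated assumptions, the learning components described above: a reward shaped around the critical bottleneck density, the Double DQN target in which the online network selects the next action and the target network evaluates it, and the transfer step, taken here to mean initializing the target-scenario agent with weights learned in the source scenario before fine-tuning. The network architecture, state and action dimensions, exact reward shape, and file name are hypothetical, since the abstract does not specify them.

```python
import torch
import torch.nn as nn

# Hypothetical Q-network: state -> one Q-value per candidate speed limit.
class QNetwork(nn.Module):
    def __init__(self, n_states: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_states, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, state):
        return self.net(state)

def reward(bottleneck_density: float, critical_density: float = 33.0) -> float:
    # Assumed shape only: the abstract states the reward is highest when
    # the bottleneck density equals the critical density, so a simple
    # negative deviation is used here.
    return -abs(bottleneck_density - critical_density)

# Online and target networks for DDQN in the target scenario
# (state and action dimensions are placeholders).
online = QNetwork(n_states=8, n_actions=6)
target = QNetwork(n_states=8, n_actions=6)

def ddqn_target(r, next_state, gamma=0.99):
    # Double DQN: the online network selects the next action,
    # the target network evaluates it.
    with torch.no_grad():
        a_star = online(next_state).argmax(dim=1, keepdim=True)
        return r + gamma * target(next_state).gather(1, a_star).squeeze(1)

# Transfer step: initialize from source-scenario weights (hypothetical
# file name) instead of random weights, then fine-tune with DDQN as usual.
online.load_state_dict(torch.load("source_scenario_ddqn.pt"))
target.load_state_dict(online.state_dict())
optimizer = torch.optim.Adam(online.parameters(), lr=1e-4)
```

Starting from the source-scenario weights rather than a random initialization is what shortens the training process in the reported transfer experiments.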
Keywords/Search Tags: expressway, variable speed limit, efficiency, road bottleneck, traffic simulation, deep reinforcement learning, transfer learning