As urban traffic congestion grows increasingly serious, intelligent traffic signal control, one of the most promising ways to alleviate it, has received widespread attention at home and abroad. Because the transportation system is complex, dynamic, and nonlinear, traditional adaptive traffic signal control algorithms that rely on modeling the transportation system can no longer cope with ever-changing traffic flow. In response, some scholars have proposed traffic signal optimization control algorithms based on reinforcement learning, which have become a research hotspot in recent years: reinforcement learning does not require a specific model of the external environment and can achieve good control performance in a complicated transportation system through continuous interaction with the environment, trial-and-error learning, and real-time adjustment of its policy. This paper designs optimal control algorithms for urban intelligent traffic signals based on deep reinforcement learning. The main work of this paper is as follows:

(1) A signal optimization control algorithm for a single intersection based on deep reinforcement learning is proposed, which enables the reinforcement learning agent to sense the real-time traffic state and output the best signal control scheme for the intersection. Compared with most current signal control algorithms, the proposed algorithm not only switches phases flexibly but also requires a yellow-light warning before each phase switch, which improves safety. The defined state has a low dimension yet contains the vast majority of the traffic information, which enables the reinforcement learning model to converge quickly and lays a foundation for the design of road network signal control schemes. Simulation results show that the algorithm effectively reduces the average waiting time of vehicles at
intersections by 54.07% compared with the fixed-timing method.

(2) A road network signal optimization control algorithm based on a central agent is designed. The central agent senses the traffic state of the entire road network and adjusts the signal schemes of all intersections in real time. In this algorithm, the state is defined as the queue length, the number of vehicles, and the vehicle speed on the roads at all intersections, and the action is defined as the phase scheme of all intersections for the next time interval. The proposed algorithm requires only a central server and no communication between intersections, so its deployment cost is low and it is suitable for small and medium-sized road networks. Simulation results show that the algorithm effectively reduces the average waiting time of vehicles in the road network, by as much as 55.98% compared with the fixed-timing method.

(3) For large-scale road network scenarios, a multi-agent road network signal optimization control algorithm is proposed to address the problem that the state space and action space become too large for the reinforcement learning model to converge. The core of the algorithm is as follows: each intersection is controlled by a local agent, which senses not only the local traffic state but also the overall traffic state by acquiring the global phase scheme; in addition, the accumulated waiting time at adjacent intersections is added to each agent's reward function, so that the agents at the intersections can coordinate and efficiently find the best road network signal control scheme. The algorithm scales well and can be extended to scenarios of various sizes without a corresponding change in the computational cost of each agent. Simulation results show that, in large-scale road network scenarios, the algorithm effectively reduces the average vehicle waiting time by
78.26% and 32.98% compared with the fixed-timing method and the central agent algorithm, respectively.

From the perspective of designing intelligent traffic signal control algorithms, the research in this paper further taps the supply potential of existing traffic systems across different scenarios. Simulation results show that the proposed algorithms effectively reduce the average waiting time of vehicles in the controlled area, alleviate urban congestion, and improve the individual driving experience in the single-intersection, small-scale road network, large-scale road network, and real road network data scenarios, which contributes to the promotion of intelligent urban transportation.
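
As an illustration only, two of the mechanisms summarized above (the mandatory yellow-light warning before a phase switch, and a local reward that includes the accumulated waiting time of adjacent intersections) might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in this paper: the names `SignalController` and `local_reward`, the yellow duration `YELLOW_STEPS`, and the neighbor weight `NEIGHBOR_WEIGHT` are all hypothetical, and the exact form of the reward is assumed.

```python
from dataclasses import dataclass
from typing import List, Optional

YELLOW_STEPS = 3       # assumed yellow duration in simulation steps (hypothetical)
NEIGHBOR_WEIGHT = 0.5  # assumed weight on neighbors' waiting time (hypothetical)


def local_reward(own_wait: float, neighbor_waits: List[float],
                 weight: float = NEIGHBOR_WEIGHT) -> float:
    """Negative waiting time at this intersection plus a weighted sum of the
    accumulated waiting time at adjacent intersections (assumed form)."""
    return -(own_wait + weight * sum(neighbor_waits))


@dataclass
class SignalController:
    """Applies an agent's requested phase, inserting yellow before any switch."""
    phase: int = 0
    yellow_left: int = 0
    pending: Optional[int] = None

    def step(self, requested_phase: int) -> str:
        """Advance one step and return the displayed signal, e.g. 'green:0'."""
        if self.yellow_left > 0:
            # A switch is in progress; new requests are ignored until it ends.
            self.yellow_left -= 1
            if self.yellow_left == 0:
                self.phase = self.pending
                self.pending = None
                return f"green:{self.phase}"
            return f"yellow:{self.phase}"
        if requested_phase != self.phase:
            # Start the yellow warning on the current phase before switching.
            self.pending = requested_phase
            self.yellow_left = YELLOW_STEPS
            return f"yellow:{self.phase}"
        return f"green:{self.phase}"


if __name__ == "__main__":
    ctrl = SignalController()
    print([ctrl.step(p) for p in (0, 1, 1, 1, 1)])
    print(local_reward(10.0, [4.0, 6.0]))
```

In this sketch the agent may request any phase at any step, which mirrors the flexible phase switching described in (1), while the controller guarantees that the current phase shows yellow for `YELLOW_STEPS` steps before the new phase turns green. The neighbor term in `local_reward` is one simple way to encourage the coordination between agents described in (3).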