With the rapid development of urban traffic and the growing function and density of urban roads, scholars abroad have begun to research adaptive traffic signal control, a promising approach to alleviating congestion. Urban transportation systems are nonlinear, dynamic, uncertain, fuzzy, and complex, so although traditional adaptive traffic signal control systems and intelligent control methods have achieved certain results, they cannot adapt well to variations in traffic flow and rely heavily on a traffic model. Reinforcement learning (RL) requires little in the way of a mathematical model or prior knowledge of the external environment, so it can achieve good learning performance in large state spaces and complicated nonlinear systems. Agent-based RL, proposed by many scholars, therefore has broad prospects for development in adaptive traffic signal control. This study employs a traffic signal control agent for each signalized intersection. Building on an analysis of the process and effectiveness of standard reinforcement learning for adaptive traffic signal control, the thesis studies the application of several typical reinforcement learning algorithms to adaptive traffic control, including a distributed Nash Q-learning algorithm, a multi-interactive history learning algorithm, and a policy gradient ascent algorithm.
The focus and innovative achievements of the thesis are as follows:

(1) Construction of a system structure model for the intersection traffic signal control agent

Because intersection traffic flow is subject to considerable interference, dynamics, and uncertainty, a hybrid system structure model for the intersection traffic signal control agent was established by fusing cognitive and reactive agent structures, based on the agent BDI theory model and following the "perception-cognition-behavior" mode.

(2) Realization of a standard reinforcement learning algorithm for adaptive traffic signal control

An independent standard reinforcement learning method, Q-learning, was used for intersection traffic signal control, and the realization process of the Q-learning algorithm was analyzed. Compared with the traditional fixed-timing control method, Q-learning was effective. To address the curse of dimensionality in the independent standard reinforcement learning algorithm, the algorithm was extended by introducing a coordination mechanism, and the convergence and effectiveness of the coordination-based standard reinforcement learning were analyzed in comparison with the independent version.

(3) Design of a distributed Nash Q-learning algorithm for adaptive traffic signal control

According to the mutual dependence of traffic flow between intersections, a mathematical model of the interaction among intersection traffic signal control agents was built based on a non-zero-sum Markov game, and a distributed Nash Q-learning algorithm was put forward to solve the model. In the proposed algorithm, each intersection's traffic signal control agent selects its action according not only to its own Q-values but also to the Q-values of the other intersections' agents; the selected joint action is a Nash equilibrium of the current Q-values of all agents.
This method lets each intersection's traffic signal control agent learn to update its Q-values under joint actions and imperfect information. Theoretical analysis and simulation results show that the method converges, and its effectiveness was analyzed in comparison with the independent reinforcement learning algorithm, fixed-timing control, and algorithms from the relevant foreign literature.

(4) Design of a multi-interactive history learning coordination algorithm for self-adaptive traffic signal control

To overcome the assumptions of complete knowledge and single interaction in existing multi-agent learning coordination mechanisms for self-adaptive traffic signal control, a multi-interaction mathematical model for intersection traffic signal control agents was built based on game theory, and a multi-interactive history learning algorithm was constructed by introducing a memory factor. In the proposed model and algorithm, each intersection's traffic signal control agent plays a coordination game with its neighbors and updates its mixed strategy according to the payoff it receives, taking into account all historical interaction information from the neighboring intersections' agents. The learning rule assigns greater significance to recent payoff information than to past payoff information. The convergence of the approach was analyzed theoretically, and how parameters such as the memory factor, the learning probability, and the local traffic change probability affect the algorithm's performance was analyzed. In an experiment on coordinated control of the main intersections along an arterial, comparison with a method from the relevant foreign literature indicates that this method is effective.

(5) Design of a policy gradient approach for self-adaptive traffic signal control

Since the state of the urban traffic environment is difficult for the control system to perceive completely, self-adaptive traffic signal control was treated as a POMDP (Partially Observable Markov Decision Process), and a POMDP environment model of intersection self-adaptive traffic signal control was established. Building on the introduced GPOMDP algorithm and addressing the shortcomings of the general policy gradient estimation approach, the OLNAC algorithm for self-adaptive traffic signal control was designed by fusing the natural gradient with a value function method. How the related parameters affect the convergence of the two algorithms was analyzed by simulation experiment. Compared with the SAT (saturation-balancing technique), the uniform technique, the random technique, and a method from the relevant foreign literature, the proposed algorithms are effective and have a certain applicability for solving self-adaptive traffic signal control.
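The independent Q-learning control loop of item (2) can be illustrated with a minimal sketch. The state encoding (discretized queue levels on two approaches) and the two-phase action set here are hypothetical simplifications for illustration, not the thesis's actual design; the update rule itself is standard tabular Q-learning.

```python
import random
from collections import defaultdict

class QLearningSignalAgent:
    """Tabular Q-learning agent for one signalized intersection (sketch).

    State: e.g. discretized queue levels on the approaches (hypothetical).
    Action: 0 = green to north-south phase, 1 = green to east-west phase.
    """

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, actions=(0, 1)):
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.actions = actions
        self.q = defaultdict(float)  # (state, action) -> Q-value, default 0

    def choose_action(self, state):
        """Epsilon-greedy selection over the phase actions."""
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """Q-learning backup: Q <- Q + alpha * (r + gamma * max_a' Q' - Q)."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])
```

A natural reward at each decision point is the negative total queue length, so that reducing delay increases the return.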
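In item (3), each agent selects the Nash equilibrium of the current Q-values of all agents. For two neighboring agents with a small discrete action set, a pure-strategy equilibrium can be found by best-response enumeration, as in the sketch below; the thesis's actual equilibrium computation and multi-intersection structure are not specified here, so the example Q-tables are hypothetical.

```python
from itertools import product

def pure_nash_equilibria(q1, q2):
    """Pure-strategy Nash equilibria of a two-player game (sketch).

    q1[a1][a2]: Q-value of agent 1 for joint action (a1, a2);
    q2[a1][a2]: Q-value of agent 2. A joint action is an equilibrium
    when neither agent can gain by unilaterally deviating.
    """
    n1, n2 = len(q1), len(q1[0])
    equilibria = []
    for a1, a2 in product(range(n1), range(n2)):
        best1 = all(q1[a1][a2] >= q1[b][a2] for b in range(n1))
        best2 = all(q2[a1][a2] >= q2[a1][b] for b in range(n2))
        if best1 and best2:
            equilibria.append((a1, a2))
    return equilibria

# Hypothetical coordination-game-like Q-tables: both intersections
# do best when their phase choices are compatible.
q1 = [[3.0, 0.0], [0.0, 2.0]]
q2 = [[3.0, 0.0], [0.0, 2.0]]
print(pure_nash_equilibria(q1, q2))  # [(0, 0), (1, 1)]
```

When several equilibria exist, a tie-breaking rule (e.g. the one with the highest joint Q-value) is needed so that neighboring agents select the same joint action.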
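The learning rule of item (4) weights recent payoffs more heavily than old ones via a memory factor. One common way to realize such a rule is an exponentially discounted payoff average that then drives a mixed-strategy adjustment, sketched below; the exact update in the thesis may differ, and the strategy-adjustment step here is an illustrative assumption.

```python
def weighted_history_value(payoffs, memory_factor):
    """Exponentially discounted value of a payoff history (most recent last).

    A payoff t interactions in the past is weighted by memory_factor**t,
    with 0 < memory_factor < 1, so recent interactions dominate older ones.
    """
    value, weight_sum = 0.0, 0.0
    for t, payoff in enumerate(reversed(payoffs)):
        w = memory_factor ** t
        value += w * payoff
        weight_sum += w
    return value / weight_sum if weight_sum else 0.0

def update_mixed_strategy(probs, action_values, learning_rate=0.1):
    """Nudge the mixed strategy toward actions with higher history values,
    then renormalize so it remains a probability distribution (sketch)."""
    new = [p + learning_rate * v for p, v in zip(probs, action_values)]
    low = min(new)
    if low < 0:  # shift to keep all probabilities nonnegative
        new = [x - low for x in new]
    total = sum(new)
    return [x / total for x in new]
```

With a memory factor near 1 the agent averages over a long interaction history; near 0 it reacts almost only to the latest payoff, which mirrors the parameter-sensitivity analysis described above.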
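Item (5) builds on GPOMDP, which estimates the policy gradient in a POMDP from observations alone by maintaining a discounted eligibility trace of the policy's score function. A minimal sketch for a softmax policy over signal phases follows; the observation features and reward are hypothetical, and the thesis's OLNAC refinement (natural gradient plus a value function) is not reproduced here.

```python
import math

def softmax_probs(theta, obs_features):
    """Action probabilities of a log-linear (softmax) policy.
    theta[a][i] weights observation feature i for action a."""
    scores = [sum(w * f for w, f in zip(row, obs_features)) for row in theta]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def gpomdp_gradient(theta, episode, beta=0.9):
    """GPOMDP-style gradient estimate from one trajectory.

    episode: list of (obs_features, action, reward) tuples. The trace z
    accumulates the score grad log pi(a|obs) discounted by beta, and the
    gradient estimate is the running average of z * reward.
    """
    n_actions, n_feats = len(theta), len(theta[0])
    z = [[0.0] * n_feats for _ in range(n_actions)]     # eligibility trace
    grad = [[0.0] * n_feats for _ in range(n_actions)]  # running average
    for t, (obs, action, reward) in enumerate(episode, start=1):
        probs = softmax_probs(theta, obs)
        for a in range(n_actions):
            for i in range(n_feats):
                score = ((1.0 if a == action else 0.0) - probs[a]) * obs[i]
                z[a][i] = beta * z[a][i] + score
                grad[a][i] += (z[a][i] * reward - grad[a][i]) / t
    return grad
```

The discount beta trades bias against variance in the gradient estimate, which is the kind of parameter whose effect on convergence the simulation experiments examine.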