
Function Approximation Type Reinforcement Learning Models For Signal Timing Optimization Of A Single Intersection

Posted on: 2018-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: T P Wang
GTID: 2382330548974742
Subject: Engineering
Abstract/Summary:
The rapid development of transportation has brought great convenience to people's lives. However, it has also caused a series of traffic problems, and traffic congestion has become a bottleneck for the sustainable development of cities. Building new roads and expanding existing ones can increase the capacity of the road network and ease congestion, but this is limited by urban land resources. Optimizing traffic control can reduce delays under unsaturated traffic flow. Existing adaptive control models rely on heuristic optimization algorithms and can only reach locally optimal solutions. With the development of artificial intelligence, intelligent algorithms offer greater adaptability and generalization ability, which provides an opportunity to improve traffic control models. In this thesis, an adaptive traffic control model is established using reinforcement learning theory. We first introduce the principles of reinforcement learning, focusing on the Q-learning algorithm and on reinforcement learning algorithms based on neural network approximation. Then, taking delay as the evaluation index of signal timing, we propose two models that aim to minimize delay: an online Q-learning model based on discrete states, and an online Q-learning model based on neural network approximation. The former stores the value function in a matrix and overcomes the "curse of dimensionality" by discretizing the traffic flow state; this discretization is itself a form of generalization. The latter uses multiple feedforward neural networks with the same structure to approximate the action-value function, which makes it possible to estimate values for traffic flow states that have not been encountered and gives better generalization ability. The performance of the two models is verified on an integrated simulation platform combining Vissim, Excel VBA and Matlab. The simulation results show that both models obtain a convergent Q matrix and the optimal signal timing scheme under each traffic flow condition. The online Q-learning model based on neural network approximation outperforms the online Q-learning model based on state discretization in terms of delay metrics. Therefore, a reinforcement learning model combined with a neural network can improve traffic control performance.
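To make the discrete-state variant concrete, here is a minimal Python sketch of a tabular Q-learning step with a discretized traffic state. It is not the thesis's implementation (which runs on Vissim, Excel VBA and Matlab). The state variable (a queue length), the bin width, the number of actions, the learning parameters and all function names are assumptions chosen for illustration only. The reward is taken as the negative delay, so maximizing reward corresponds to minimizing delay.

```python
import random


def discretize(queue_length, bin_width=5, n_bins=4):
    """Map a continuous queue length onto one of n_bins discrete states.

    Assumption: queue length stands in for the traffic flow state;
    anything beyond the last bin edge is clipped into the last bin.
    """
    return min(int(queue_length // bin_width), n_bins - 1)


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard Q-learning update to the table q in place.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    td_target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    n_actions = len(q[state])
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q[state][a])


# Hypothetical setup: 4 discrete states, 2 actions (e.g. two candidate green splits).
q_table = [[0.0, 0.0] for _ in range(4)]
rng = random.Random(0)

s = discretize(3.0)           # short queue -> state 0
s_next = discretize(12.3)     # longer queue -> state 2
delay = 10.0                  # assumed observed delay for this step
q_update(q_table, s, 1, -delay, s_next)   # reward is the negative delay
best = epsilon_greedy(q_table, s, 0.0, rng)  # purely greedy choice
```

After this single update, action 1 in state 0 carries a negative value, so the greedy policy prefers action 0 there. The neural-network variant described in the abstract would replace the `q_table` lookup with a learned approximator, which this sketch does not attempt.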
Keywords/Search Tags:BP neural network approximation, behavior selection strategy, Q learning algorithm, Integrated simulation platform, Signal timing optimization