
Research And Implementation On Multi-traffic-light Control Policy Based On Deep Reinforcement Learning

Posted on: 2022-10-03
Degree: Master
Type: Thesis
Country: China
Candidate: H P Zhang
GTID: 2492306740491924
Subject: Computer technology
Abstract/Summary:
With the growth of China's economy, household car ownership has increased explosively, worsening urban traffic congestion. Intelligent traffic light control is a critical way to improve the efficiency of the transportation system. Conventional traffic light control is mostly built on hand-crafted rules and cannot adapt its policy to dynamic traffic conditions. Because deep reinforcement learning can handle complex traffic situations, there is an emerging trend of using it for traffic light control. Current traffic light control based on deep reinforcement learning falls into two categories: single-agent modeling and multi-agent modeling.

Single-agent modeling is the more mature approach. Current single-agent methods require a data buffer that caches history records for each agent, and the memory cost of maintaining this buffer becomes a bottleneck as the traffic system grows. This thesis therefore proposes a traffic light control algorithm based on on-policy learning, which can learn an effective control policy without extra memory. To address the low sample efficiency of basic on-policy learning, the proposed approach adopts proximal policy optimization (PPO). PPO introduces a clipping method based on the idea of importance sampling. Clipping controls the direction and magnitude of each update by bounding the probability ratio between the new and old policies. As a result, samples can be reused more than once to update the policy effectively without breaking the theoretical properties of on-policy learning.

Single-agent modeling is an independent learning method that does not consider cooperation among agents; it optimizes system utility only indirectly, by optimizing local utility. Multi-agent modeling learns a cooperative policy by optimizing system utility directly. However, conventional multi-agent approaches suffer from the curse of dimensionality. This thesis therefore proposes a cooperative traffic signal control approach that follows the centralized training with decentralized execution pattern. An algorithm following this pattern depends only on local information during policy execution, which reduces the state and action spaces. During training, it uses system-wide information and optimizes global utility, which leads agents to learn better cooperative policies. The pattern thus alleviates the curse of dimensionality while preserving the learning ability of the algorithm.

The thesis evaluates the proposed single-agent approach on numerous transportation system settings. The experimental results show that it achieves similar or better results compared with baseline methods. The multi-agent approach is also evaluated in various settings. The experimental results show that it achieves better results than the single-agent approach in small and medium transportation systems, and that the improvement becomes more pronounced as the complexity of the transportation system increases. The final contribution of this thesis is the design and implementation of a traffic signal control policy learning system, in which a user can upload a transportation system setting and assign a learning algorithm to create a traffic light control policy learning task.
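
The PPO clipping idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it is the standard PPO clipped surrogate objective, and the function name `clipped_surrogate`, its argument names, and the clip range `eps=0.2` are assumptions chosen for illustration.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantages, eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current policy and under the policy that collected the samples.
    advantages: advantage estimates for the same samples.
    eps: clip range (0.2 is an assumed, commonly used value).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Importance-sampling ratio between the new and old policies.
        ratio = math.exp(lp_new - lp_old)
        # Bound the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))
        # Take the pessimistic (smaller) of the two objective terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

Because the ratio is bounded, a batch of samples can be used for several update steps before the new policy drifts too far from the one that collected the data. When the new and old policies agree, the ratio is 1 and the objective reduces to the mean advantage.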
Keywords/Search Tags:traffic light control, deep reinforcement learning, multi-agent system