
Research On Longitudinal Trajectory Planning Algorithm For Connected And Automated Vehicle In Mixed Traffic Flow Based On Reinforcement Learning Theory

Posted on: 2023-02-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Q Cheng
GTID: 1522307028957319
Subject: Transportation planning and management
Abstract/Summary:
With the rapid development of China's transportation industry, transportation infrastructure continues to expand, and the number of new motor vehicles added each year ranks among the highest in the world. However, the existing urban road traffic supply still struggles to meet rapidly growing travel demand, and the problems caused by traffic congestion need to be solved urgently. The slow response, error-proneness, and randomness of human drivers in road traffic flow have become the direct cause of traffic problems. With the development of fifth-generation mobile communication technology and advances in automated driving technology, the connected and automated vehicle (CAV), which combines the two advantages of "information interconnection" and "automatic driving", will soon become an important part of the intelligent transportation system. Using CAV trajectory planning to reduce the uncertainty in the driving process of manually driven vehicles has become a new direction for solving road traffic problems. In the early stage of deployment, CAV trajectory planning tasks face two types of challenges: 1) The uncertainty of manually driven vehicles, and the complex environmental characteristics caused by the nonlinear superposition of this uncertainty, are present throughout mixed traffic flow. This directly increases the difficulty of CAV decision-making and prolongs the time needed to compute its optimal behavior. 2) The influence between the two types of vehicles is not one-way. Their interactions and the subsequent effects appear randomly and continuously in mixed traffic flow, which further increases the difficulty of deciding the CAV's optimal behavior in subsequent states. Addressing these challenges, and thereby strengthening the ability of CAVs in mixed traffic flow to improve traffic efficiency, reduce fuel consumption, and improve safety, is therefore of significant research value. On this basis, the main contents of this dissertation are as follows:

(1) First, a method for decomposing traffic into basic interaction units is proposed. It reduces the difficulty of analyzing the complex problems caused by the uncertainty of manually driven vehicle behavior and by vehicle interaction in mixed traffic flow. On this basis, a mixed traffic flow environment model is constructed that can interact with the CAV and feed back information in real time. The state characteristic information, the internal vehicle dynamics equations, the optimization objective functions, and the boundary conditions are integrated into the three layers of the environment model: the information input layer, the state update layer, and the information output layer. The mixed traffic flow environment model is the theoretical basis for constructing the CAV decision-making model within the framework of reinforcement learning theory.

(2) A finite Markov decision process (FMDP) model is proposed from the perspective of CAV decision-making. The trajectory planning task of a CAV in mixed traffic flow is abstracted into a sequence of consecutive decisions. In this model, the continuous time interval is discretized into a time series, and the elements of the intermediate and terminal states in the CAV decision chain are defined. Then, starting from the behavior of the CAV, the FMDP state transition probability is defined according to the possible changes in the system brought about by CAV actions. For the different optimization objective functions, the relationship between the benefit value output by the environment model and the FMDP reward value is analyzed. Finally, the reward function used in the decision-making model is defined from the optimization objective function of the mixed traffic flow environment model, and the role of the attenuation coefficient (discount factor) in the model is discussed. The FMDP model is the theoretical basis for the following two online trajectory planning algorithms.

(3) An online CAV trajectory planning method based on Monte Carlo tree search (MCTS-MTF) is proposed. It allows the CAV to deal flexibly with the complex environment of mixed traffic flow and to decide the optimal trajectory in real time. In the research scenarios, the mixed traffic flow state, the successor state, and the CAV behavior are abstracted into the root node, the leaf nodes, and the branches, respectively, of the search tree. A rapid multi-step simulation process derived from the environment model, together with an upper confidence bound function based on a method for setting the balance coefficient, constitutes the main structure of the algorithm. The proposed tree expansion decision module and branch generation conditions further improve the overall efficiency of the algorithm. Simulation tests verify the effectiveness and convergence of the algorithm for three types of optimization objectives. In the sensitivity test, the algorithm remains effective under different background traffic volumes.

(4) Another online CAV trajectory planning method, based on deep reinforcement learning (ADRL-MTF), is proposed to further shorten the decision time for the optimal trajectory. In this method, a state value approximation network (Q network) replaces the huge state value storage table of the traditional solution algorithm. Four elements that have the greatest impact on vehicle behavior (current signal light color, remaining light time, remaining distance of the vehicle, and instantaneous vehicle speed) are selected using domain knowledge and serve as the inputs of the Q network. The reward value in the algorithm framework is analyzed, and a specific reward function is set to guide the behavior of the CAV during training. Simulation tests verify the effectiveness and the real-time application potential of the algorithm. The stability test shows that the method is relatively stable under changes in the initial headway distribution of the traffic flow and is positively correlated with the CAV ratio in the traffic flow. The sensitivity test shows that the proposed method is more sensitive to changes in the speed limit on long road sections.
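
The abstract does not state the upper confidence bound function used in MCTS-MTF. As a rough illustration only, the following Python sketch shows how a balance coefficient trades exploitation against exploration in the textbook UCB1 rule for tree search. The `Node` class, the function names, the action labels, and the default value `c=1.4` are all hypothetical and are not taken from the dissertation.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One traffic state in the search tree (hypothetical structure)."""
    visits: int = 0              # how many simulations passed through this node
    total_reward: float = 0.0    # sum of simulated returns seen from this node
    children: dict = field(default_factory=dict)  # CAV action -> successor Node


def ucb_score(parent_visits: int, child: Node, c: float) -> float:
    """Textbook UCB1 score: mean return plus an exploration bonus.

    c is the balance coefficient; this generic form is an assumption,
    not the formula from the dissertation.
    """
    if child.visits == 0:
        return math.inf          # always try an unvisited branch first
    mean_return = child.total_reward / child.visits
    bonus = c * math.sqrt(math.log(parent_visits) / child.visits)
    return mean_return + bonus


def select_action(node: Node, c: float = 1.4):
    """Return the action (branch) whose child has the highest UCB1 score."""
    return max(node.children,
               key=lambda a: ucb_score(node.visits, node.children[a], c))
```

With `c = 0` the rule is purely greedy on the mean simulated return; larger values of `c` push the search toward branches that have been visited less often.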
Keywords/Search Tags:Mixed traffic flow, Connected and automated vehicle, Longitudinal trajectory planning, Monte Carlo tree search, Deep reinforcement learning