| In recent years, with the steady development of the national economy, the domestic civil aviation industry has made great progress. However, growing air traffic volume places increasing strain on limited airspace resources, and random factors such as severe weather compound the problem, resulting in large-scale traffic congestion across the country. It is therefore urgent to study rerouting planning for traffic-congested airspace. First, this thesis introduces the airspace environment models and rerouting algorithms in detail and gives a detailed account of reinforcement learning. Second, the reinforcement learning method and the artificial potential field method are used to study rerouting planning in traffic-congested airspace. The airspace environment is rasterized and the grid cells are classified by type; the reward function of the Markov decision process is improved to reflect actual aircraft operations; and the Q-Learning algorithm, following an ε-greedy action strategy, is then used to carry out rerouting planning. To find the best value of ε, the control variable method is used to simulate different values of ε, and the resulting paths are compared with those of the artificial potential field method. Third, severe weather is added as an influence factor to the original airspace environment model, and the rerouting strategy is applied again. The Graham scan method is used to delimit the regional boundary of severe weather identified in the radar echo map; the region is then rasterized and superimposed on the original airspace environment to build a new airspace environment model. Rerouting planning is then carried out in MATLAB using both the Q-Learning algorithm with the ε-greedy strategy and the artificial potential field method, and the simulation results are analyzed to select a suitable rerouting path. The results show that, in the different airspace environment models, both algorithms can find rerouting paths from the starting point to the destination that avoid congested waypoints and severe weather regions, but the routes generated by the Q-Learning algorithm with the ε-greedy action strategy achieve better simulation performance indicators. |
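
The grid-based Q-Learning procedure summarized above can be illustrated with a minimal sketch. It is written in Python rather than the MATLAB used in the thesis, and it is not the thesis's implementation: the cell codes (`FREE`, `BLOCKED`, `GOAL`), the reward values, the hyperparameters, and the function names `step`, `train`, and `greedy_route` are all assumptions chosen for illustration. The improved reward function described in the abstract is not reproduced here; a simple step cost, obstacle penalty, and goal reward stand in for it.

```python
import random

# Hypothetical cell codes for a rasterized airspace (not taken from the thesis).
FREE, BLOCKED, GOAL = 0, 1, 2
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(grid, state, action):
    """Apply one move and return (next_state, reward, done)."""
    rows, cols = len(grid), len(grid[0])
    r, c = state[0] + action[0], state[1] + action[1]
    # Leaving the grid or entering a blocked cell: stay in place, pay a penalty.
    if not (0 <= r < rows and 0 <= c < cols) or grid[r][c] == BLOCKED:
        return state, -10.0, False
    if grid[r][c] == GOAL:
        return (r, c), 100.0, True
    return (r, c), -1.0, False  # small step cost favours shorter routes


def train(grid, start, episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-Learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    rows, cols = len(grid), len(grid[0])
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(rows) for c in range(cols)}
    for _ in range(episodes):
        state = start
        for _ in range(200):  # cap on episode length
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])  # exploit
            nxt, reward, done = step(grid, state, ACTIONS[a])
            # Q-Learning target: bootstrap from the best action in the next state.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_route(grid, q, start, max_steps=100):
    """Follow the learned Q-table greedily from start and return visited cells."""
    route, state = [start], start
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _, done = step(grid, state, ACTIONS[a])
        route.append(state)
        if done:
            break
    return route
```

As a usage example, a 4x4 grid with a few blocked cells (standing in for congested waypoints or a rasterized weather region) can be trained and queried with `q = train(grid, (0, 0))` followed by `greedy_route(grid, q, (0, 0))`. Repeating the run with different `epsilon` values while holding the other parameters fixed mirrors the control variable comparison described in the abstract.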