
Flight Conflict Resolution Method Based On Deep Reinforcement Learning

Posted on: 2022-07-13    Degree: Master    Type: Thesis
Country: China    Candidate: H Wen    Full Text: PDF
GTID: 2492306551471154    Subject: Master of Engineering
Abstract/Summary:
To address increasing airspace congestion and growing air traffic flow, the United States Federal Aviation Administration proposed the concept of free flight. With the introduction of this concept, the air traffic control problem has become particularly complex. With the rapid development of China's civil air transportation industry and the growing demand for civil aviation transport, research on flight conflict resolution strategies is essential to ensure aircraft safety under free-flight conditions. Existing flight conflict resolution methods include optimal-control, probabilistic, and mathematical-programming approaches. These traditional methods suffer from low efficiency, excessive computation, and poor real-time performance. This dissertation exploits the advantages of deep reinforcement learning in solving sequential decision-making problems, combines deep reinforcement learning with the flight conflict resolution task, and proposes a deep reinforcement learning-based flight conflict resolution method. The main research contents of this dissertation are as follows:

(1) Addressing the low efficiency and high computational cost of traditional conflict resolution methods, and considering the continuous state and action spaces of multiple aircraft during conflict resolution, the deep reinforcement learning algorithms DDPG (Deep Deterministic Policy Gradient) and SAC (Soft Actor-Critic) are applied to a single-heading-deflection conflict resolution task, and simulation experiments are carried out. By analyzing the geometric configuration of the aircraft in flight, the simulation environment required for the experiments is set up; combined with the characteristics of the single-heading-deflection task, the state space and action space of the aircraft in the airspace and a reward function that accounts for heading deflection are designed. To address the slow convergence of the DDPG algorithm in the simulation experiments, this dissertation starts from the sampling strategy, combines the standard DDPG algorithm with prioritized experience replay, and proposes an improved DDPG algorithm. The experimental results show that the improved DDPG algorithm converges faster than the original DDPG algorithm. The improved DDPG and SAC algorithms are then compared on the conflict resolution task, and both achieve good deflection angles and resolution times. Finally, the improved DDPG algorithm is compared with a traditional conflict resolution algorithm based on mixed-integer linear programming (MILP). The results show that both the improved DDPG algorithm and the MILP algorithm produce small deflection angles, and the improved DDPG algorithm is more efficient.

(2) To further optimize the conflict resolution trajectory, the single-heading-deflection strategy is extended to multiple heading deflections. The improved DDPG algorithm and the SAC algorithm are each applied to the multi-deflection conflict resolution task, and simulation experiments are carried out. First, considering the characteristics of this task, new state and action spaces are designed: the aircraft coordinates are added to the state space, and the aircraft's heading-angle deflection constraint within the airspace is added to the action space. Second, according to the requirements of multi-deflection conflict resolution, a reward-shaping function is designed around two factors, heading deflection and distance to the destination. It is compared against a sparse-reward design, their differences in training results and training efficiency are analyzed, design guidelines for the reward function in the multi-deflection conflict resolution task are summarized, and the training efficiency of the algorithm and the quality of the trajectory are improved. Then, by comparing training results and training efficiency under different learning rates, the learning rate of the SAC algorithm for this task is determined. Finally, the simulation results of the improved DDPG and SAC algorithms are compared. Both algorithms resolve conflicts effectively: the improved DDPG algorithm achieves a smaller deflection angle and shorter resolution time, while the SAC algorithm is more stable and produces shorter trajectories.
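The improved DDPG algorithm in (1) replaces uniform replay sampling with prioritized experience replay. The following is a minimal, illustrative sketch of a proportional prioritized replay buffer; the class, its parameter names (`alpha`, `beta`, `eps`), and all default values are assumptions for illustration, not taken from the thesis.

```python
import numpy as np

class PrioritizedReplayBuffer:
    """Sketch of proportional prioritized experience replay.

    Transitions are sampled with probability proportional to
    |TD-error|**alpha; importance-sampling weights correct the
    resulting bias. Hyperparameters here are illustrative only.
    """

    def __init__(self, capacity, alpha=0.6, beta=0.4, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # how strongly priorities skew sampling
        self.beta = beta        # strength of importance-sampling correction
        self.eps = eps          # keeps priorities strictly positive
        self.buffer = []
        self.priorities = []
        self.pos = 0            # circular write index

    def add(self, transition, td_error=1.0):
        p = (abs(td_error) + self.eps) ** self.alpha
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
            self.priorities.append(p)
        else:
            self.buffer[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        probs = np.array(self.priorities)
        probs /= probs.sum()
        idx = np.random.choice(len(self.buffer), batch_size, p=probs)
        # Importance-sampling weights compensate for non-uniform sampling.
        weights = (len(self.buffer) * probs[idx]) ** (-self.beta)
        weights /= weights.max()
        return [self.buffer[i] for i in idx], idx, weights

    def update_priorities(self, idx, td_errors):
        # After each learning step, refresh priorities with new TD-errors.
        for i, e in zip(idx, td_errors):
            self.priorities[i] = (abs(e) + self.eps) ** self.alpha
```

In the improved DDPG setting, the critic's TD-error for each sampled transition would be fed back through `update_priorities`, so that high-error (poorly learned) transitions are revisited more often, which is the mechanism behind the faster convergence reported above.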
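The reward-shaping idea in (2), balancing heading deflection against progress toward the destination while penalizing loss of separation, might be sketched as follows. All weights, the separation threshold, and the use of nautical miles are illustrative assumptions, not values from the thesis.

```python
# Illustrative shaped reward for multi-heading-deflection conflict
# resolution: penalize large heading changes, reward progress toward
# the destination, and heavily penalize loss of separation.
# Every weight and the 5 NM threshold below are assumptions.
def shaped_reward(deflection_deg, dist_to_dest_nm, prev_dist_nm,
                  min_separation_nm, sep_threshold_nm=5.0,
                  w_angle=0.01, w_progress=1.0, conflict_penalty=100.0):
    r = -w_angle * abs(deflection_deg)                   # prefer small heading changes
    r += w_progress * (prev_dist_nm - dist_to_dest_nm)   # reward closing the distance
    if min_separation_nm < sep_threshold_nm:             # loss of separation
        r -= conflict_penalty
    return r
```

Unlike a sparse reward that fires only on conflict or arrival, this dense signal gives the agent feedback at every step, which is the property the dissertation exploits to improve training efficiency and trajectory quality.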
Keywords/Search Tags:Deep reinforcement learning, Flight conflict resolution, DDPG algorithm, SAC algorithm, Prioritized experience replay