| With the development of artificial intelligence, intelligent control technology is widely used in the military field. In the simulation of future air combat, intelligent control of air combat engagement is the core problem in studying future air combat tactics. This thesis takes unmanned combat aerial vehicle (UCAV) air combat simulation as its research object and studies the application of deep reinforcement learning to autonomous decision making in air combat simulation. Deep reinforcement learning algorithms are used to realize intelligent control of UCAVs, a simulation system is built to demonstrate the engagement, and a 3D virtual air combat scene is constructed to enhance realism and immersion. On this basis, a mixed-reality-based UCAV air combat simulation system is developed. The main research work of this thesis is as follows. First, optimization algorithms are studied for multi-target allocation in UCAV cooperative air combat, and an improved Grey Wolf Optimizer (GWO) is used to allocate targets reasonably. Taking into account the air combat situation information and the aircraft performance of both sides, the UCAV multi-target allocation problem is modeled mathematically to establish a comprehensive advantage function of the friendly side over the enemy side. A discrete Grey Wolf algorithm is designed for this problem, and the optimal allocation scheme is found by maximizing the global comprehensive advantage function. The algorithm adds a dual head-wolf mechanism and an adaptive step mechanism to the Grey Wolf algorithm, which mitigates the original algorithm's premature convergence and its tendency to fall into local optima.
Second, deep reinforcement learning algorithms are studied for the intelligent control of UCAVs. The Deep Deterministic Policy Gradient (DDPG) algorithm is used within the Actor-Critic (AC) framework, and the idea of a centralized critic with decentralized execution is adopted: a global Critic network and individual Critic networks are established to realize multi-UCAV engagement control in the simulation. A reward function is designed around the characteristics of air combat simulation control, covering UCAV altitude, angle, speed and the target allocation scheme, and the UCAV action input model is built on the flight dynamics equations of the vehicle. The reward function is divided into external and internal rewards to support both interaction with the environment and teamwork. The external reward is the integrated reward computed from environmental feedback and serves as the evaluation index of UCAV-environment interaction. The internal reward is computed by the Generative Adversarial Imitation Learning (GAIL) algorithm as a measure of similarity between the UCAV's behavioral trajectory and the expert trajectory, and it addresses the cold-start problem in which sparse rewards slow down the early stage of training.
Finally, the TensorFlow-based deep reinforcement learning environment is connected to the Unity3D-based virtual reality environment. A communication program passes the output of the neural network to the Unity3D environment to control UCAV maneuvers, and after each maneuver the feedback information is returned to the neural network to update its parameters. A UI and a human-computer interaction mechanism that combines the virtual and the real are designed to control system operation. Mixed reality technology, C# and Python are combined to integrate the algorithm control module, the virtual battlefield module and the human-computer interaction module into a mixed-reality-based UCAV air combat simulation system. The system is packaged as a common platform application and released to HoloLens 2 for 3D visualization of the air combat process, and on this basis the feasibility of the system is verified. |
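The allocation objective described in the abstract can be illustrated with a minimal sketch. It is not the thesis's improved Grey Wolf Optimizer: it only shows the quantity such an optimizer would maximize, and it uses exhaustive search as a stand-in that is practical only for very small problems. The advantage values, the function names, and the constraint that every target must be engaged by at least one UCAV are all assumptions made for illustration.

```python
from itertools import product


def global_advantage(advantage, allocation):
    """Sum of pairwise advantages, where allocation[i] is the target of UCAV i."""
    return sum(advantage[i][j] for i, j in enumerate(allocation))


def best_allocation(advantage):
    """Exhaustively search allocations that cover every target at least once.

    Stand-in for a metaheuristic such as GWO; cost grows as n_target ** n_ucav.
    Returns (None, -inf) if there are fewer UCAVs than targets.
    """
    n_ucav, n_target = len(advantage), len(advantage[0])
    best, best_score = None, float("-inf")
    for allocation in product(range(n_target), repeat=n_ucav):
        if set(allocation) != set(range(n_target)):
            continue  # assumed constraint: no target is left unengaged
        score = global_advantage(advantage, allocation)
        if score > best_score:
            best, best_score = allocation, score
    return best, best_score


# Hypothetical advantage values: rows are friendly UCAVs, columns are enemy targets.
ADVANTAGE = [
    [0.9, 0.2],
    [0.4, 0.6],
    [0.3, 0.8],
]

if __name__ == "__main__":
    allocation, score = best_allocation(ADVANTAGE)
    print("allocation:", allocation, "global advantage:", round(score, 3))
```

A discrete optimizer would replace the exhaustive loop with a population of candidate allocations scored by the same `global_advantage` function.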
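The split between external and internal rewards can be sketched as a weighted sum. The abstract does not give the exact formula, so the following assumes one common GAIL-style form, in which a discriminator outputs the probability that a state-action pair came from the expert and the imitation reward is the negative log of its complement. The weights `w_ext` and `w_int` and both function names are hypothetical.

```python
import math


def gail_internal_reward(d_prob, eps=1e-8):
    """Imitation reward from a discriminator output d_prob in (0, 1).

    Assumes d_prob is the discriminator's probability that the (state, action)
    pair came from the expert, so a higher value means more expert-like behavior.
    """
    d_prob = min(max(d_prob, eps), 1.0 - eps)  # clamp so the log stays finite
    return -math.log(1.0 - d_prob)


def total_reward(external, d_prob, w_ext=1.0, w_int=0.5):
    """Weighted sum of the environment reward and the imitation reward."""
    return w_ext * external + w_int * gail_internal_reward(d_prob)


if __name__ == "__main__":
    # With no environment reward yet, expert-like behavior still earns signal.
    print(total_reward(external=0.0, d_prob=0.9))
    print(total_reward(external=0.0, d_prob=0.1))
```

Because the imitation term is nonzero even when the environment reward is zero, it gives the policy a learning signal early in training, which is the cold-start role the abstract assigns to the internal reward.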