| Artificial intelligence will play an important role in future intelligent operations, and autonomous decision-making technology based on multi-agent systems is a focus of that work. As a typical form of combat, multi-UAV combat is a research hotspot because of its uncertainty and complexity. Through multi-UAV combat mission analysis, multi-agent system modeling, and combat simulation, this paper studies deep reinforcement learning (DRL) algorithms for multi-UAV combat missions. The research contents are as follows: 1. We present the background and research value of multi-UAV combat tasks and introduce multi-agent deep reinforcement learning methods and their application fields in detail. In addition, we apply DRL’s perception and decision-making ability to multi-UAV combat tasks and propose a deep reinforcement learning method for multi-UAV combat missions. 2. We propose two efficient training techniques for improving the performance of multi-agent reinforcement learning (MARL) algorithms on the multi-UAV combat problem. The first is scenario-transfer training, which uses the experience obtained in simpler combat tasks to assist training on complex tasks. The second is self-play training, which continuously improves performance by iteratively training agents and their counterparts. 3. The field of multi-UAV combat has accumulated rich human experience. We propose a rule-coupled reinforcement learning method that abstracts this experience into tactical rules, which are used to guide the learning process of the agents. This reduces invalid exploration, which speeds up training and enhances the capability of the agents. 4. DRL algorithms require a large amount of computation in multi-agent environments, resulting in slow convergence. We propose a parallel approach to multi-agent deep reinforcement learning that parallelizes the training process, which helps to adjust hyper-parameters quickly, make full use of computing resources, and reduce the time required for training. |