
Research On Intelligent Cooperative Strategies Of UAV Swarm For Adversarial Tasks

Posted on: 2024-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Sun
GTID: 2530307061468194
Subject: Control Science and Engineering
Abstract/Summary:
With the rapid development of artificial intelligence, autonomous decision-making for unmanned aerial vehicles (UAVs) has become central to modern intelligent aerial warfare and is among the most actively researched topics in the military field. Based on game theory and combined with computational methods such as deep reinforcement learning, this thesis studies intelligent cooperative strategies for UAVs under different combat conditions. The key research contents are as follows.

First, for one-on-one close-range air combat between UAVs in dynamic environments, where the game situation changes constantly and the decision-making process is complex, a Minimax-DDQN intelligent decision-making method with an immune guidance strategy is proposed. Combining reinforcement learning with game theory improves the system's autonomous learning ability and its ability to handle complex state features. The DDQN neural network is first trained with a combination of the immune guidance strategy and the Minimax strategy so that it outputs the optimal strategy in real time against enemy maneuvers. Experience replay is then used to break the correlation between training samples and improve training efficiency. Finally, a three-dimensional air combat simulation environment is constructed. Simulation results show that, when pursuing an enemy aircraft flying a straight-line maneuver, the proposed algorithm's success rate is 60% and 33% higher than those of the traditional DDQN and Minimax-DQN algorithms, respectively. Moreover, when competing against the enemy aircraft using the Minimax-DDQN algorithm, the final win rate can reach 60%, demonstrating the validity and superiority of the proposed decision-making algorithm.

Second, for the three-body adversarial scenario in which the enemy's defensive strategy is unknown and the UAV must defend actively in a dynamic environment, traditional methods require the enemy's control strategy to be known in advance. This thesis therefore proposes an adaptive heuristic dynamic programming (AHDP) algorithm based on a differential game model to solve for the offensive and defensive strategies. A differential game model is first constructed for the three-body attack-defense problem involving an attacking UAV, a target UAV, and a defensive UAV; the confrontation scenarios include both zero-sum and non-zero-sum games. AHDP controllers are then designed for the attacking and defending UAVs; they solve for the optimal performance index function by using dynamic confrontation information to iteratively update the parameters of the evaluation network and the execution network. Simulation results show that the AHDP method is more accurate and efficient than the traditional PN, APN, and ADP methods under three attack geometries: lateral, frontal, and tail-chase. The target UAV and the defensive UAV coordinate well, and the defensive UAV successfully intercepts the attacking UAV, protecting the target UAV, which acts as bait in the adversarial game.

Third, for attack and defense between large-scale UAV swarms, a neuroendocrine-immune mechanism-inspired stochastic game model (NISG) is proposed, and the USPPO algorithm is used to solve for the swarm's confrontation strategy. Inspired by biological mechanisms, the model aggregates the situational information perceived by the UAVs in the swarm confrontation system, constructs the stochastic game behavior between UAVs, and improves the efficiency of the swarm's attack and defense. Building on reinforcement learning, a USPPO learning algorithm suited to UAV swarms is proposed; it generates confrontation strategies for large UAV swarms through centralized training and distributed execution. Simulation results show that the algorithm can effectively simulate complex emergent swarm behaviors and obtains higher average rewards than the MADDPG and improved MAPPO learning algorithms, with a win rate of 74.6% against the improved MAPPO, thereby better meeting the needs of large-scale UAV swarm confrontation.
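
The abstract does not give the update rule behind the Minimax-DDQN method, so the following is only a minimal sketch of how a double-DQN target might be combined with a minimax backup and an experience replay buffer. It assumes a pure-strategy maximin over a small table of Q-values indexed by own action and opponent action, and it omits the immune guidance strategy entirely. All names (`maximin_action`, `minimax_ddqn_target`, `ReplayBuffer`) are hypothetical and are not taken from the thesis.

```python
import random
from collections import deque


def maximin_action(q_matrix):
    """Pick the own action whose worst-case value over opponent actions is largest.

    q_matrix[a][o] is the estimated value of own action a against opponent
    action o. Pure strategies only (an assumption made for this sketch).
    """
    worst_case = [min(row) for row in q_matrix]
    best = max(range(len(worst_case)), key=worst_case.__getitem__)
    return best, worst_case[best]


def minimax_ddqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double-DQN-style target with a maximin backup.

    The online estimate selects the maximin action for the next state; the
    target estimate evaluates that action against its own worst-case reply.
    """
    if done:
        return reward
    a_star, _ = maximin_action(q_online_next)
    return reward + gamma * min(q_target_next[a_star])


class ReplayBuffer:
    """Fixed-size buffer; uniform sampling breaks correlation between samples."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are dropped

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size, rng=random):
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


if __name__ == "__main__":
    q_online = [[1.0, 3.0], [2.0, 2.5]]   # next-state values from the online net
    q_target = [[0.5, 4.0], [1.5, 1.0]]   # next-state values from the target net
    print(minimax_ddqn_target(1.0, False, q_online, q_target, gamma=0.9))
```

Selecting the action with one estimate and evaluating it with another is the standard double-DQN device for reducing overestimation; the maximin step replaces the usual max over own actions so that the value reflects the opponent's least favorable reply.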
Keywords/Search Tags: UAV swarm confrontation strategies, game theory, deep reinforcement learning, adaptive heuristic dynamic programming, immune intelligence