Research On Multi-UAV Cooperative Control Algorithm Based On Reinforcement Learning

Posted on: 2024-08-27
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhang
GTID: 2542307088963259
Subject: Mechanical and electrical engineering
Abstract/Summary:
With the rapid development of artificial intelligence and unmanned aerial vehicle (UAV) technology, multi-UAV systems are increasingly widely used in both military and civilian fields. In the military field especially, multi-UAV systems need high autonomy, collaboration, and adaptability to cope with extremely complex and uncertain environments. In this context, this paper introduces reinforcement learning into multi-UAV cooperative control, aiming to solve three problems of reinforcement learning-based multi-UAV cooperative control: difficult convergence, poor cooperation among UAVs, and a low task completion rate. The improved algorithms are verified through simulation experiments. The main contributions of this paper are as follows:

1. A multi-UAV simulation platform is developed to test the application of multi-UAV systems in battlefield operations. The software can simulate red-blue military confrontation, and its strategy interface accepts different strategies for controlling the multi-UAV system, so that the control effect of different strategy algorithms can be demonstrated. It visualizes the results clearly and provides the basis for subsequent algorithm training and simulation demonstrations.

2. A reinforcement learning algorithm, RND-MADDPG, based on random network distillation is proposed to address the low efficiency and repetitive search of the MADDPG algorithm during exploration. The algorithm introduces the curiosity mechanism of random network distillation to optimize the agents' experience exploration, improving exploration efficiency by prioritizing unknown areas. Simulation experiments on the platform show that, compared with MADDPG, RND-MADDPG significantly improves exploration efficiency and convergence performance, and improves the task success rate by 6%.

3. To address the long training time, insufficient use of hardware resources, and slow convergence of the MADDPG algorithm, this paper proposes a reinforcement learning algorithm, PAR-MADDPG, based on parallel training and a multi-level replay buffer. The algorithm borrows the parallel experience exploration idea of the A3C algorithm, makes full use of the computing resources of a multi-core CPU, and constructs a multi-level experience pool in which experiences of different value are partitioned, stored, and replayed with priority. Experimental results show that PAR-MADDPG reduces the training time to 1/7 of the original and improves the task success rate by 8% compared with MADDPG. PAR-MADDPG also shows good convergence speed and stability.

4. When the dimensions of the agents' action space and state space are too large, information becomes redundant and key information is difficult to learn. To address this, this paper proposes a reinforcement learning algorithm, ATT-MADDPG, based on a multi-head attention mechanism. The algorithm uses attention to extract effective information and further improves cooperation between agents. Experimental results show that ATT-MADDPG obtains higher rewards than the baseline algorithm and improves the task success rate by 14% compared with MADDPG.
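
The curiosity mechanism named in contribution 2 can be illustrated with a toy sketch of random network distillation in general, not the thesis's RND-MADDPG implementation. A fixed random target network is never trained, a predictor is trained to imitate it on visited states, and the predictor's error serves as an exploration bonus that shrinks for familiar states. The class name `RNDBonus`, the linear predictor, the feature size, and the learning rate are all assumptions made for illustration.

```python
import math
import random


class RNDBonus:
    """Toy random network distillation (hypothetical helper, not from the thesis).

    The bonus is the predictor's squared error against a fixed random target.
    """

    def __init__(self, state_dim, feat_dim=4, lr=0.1, seed=0):
        rng = random.Random(seed)
        # Fixed, never-trained target network: f(s) = tanh(W s).
        self.W = [[rng.gauss(0, 1) for _ in range(state_dim)] for _ in range(feat_dim)]
        # Trainable linear predictor: g(s) = V s (an assumed, minimal model).
        self.V = [[rng.gauss(0, 1) for _ in range(state_dim)] for _ in range(feat_dim)]
        self.lr = lr

    def _target(self, s):
        return [math.tanh(sum(w * x for w, x in zip(row, s))) for row in self.W]

    def _predict(self, s):
        return [sum(v * x for v, x in zip(row, s)) for row in self.V]

    def bonus(self, s):
        """Intrinsic reward: squared prediction error, large for unfamiliar states."""
        return sum((p - t) ** 2 for p, t in zip(self._predict(s), self._target(s)))

    def update(self, s):
        """One gradient step pulling the predictor toward the target on a visited state."""
        errs = [p - t for p, t in zip(self._predict(s), self._target(s))]
        for row, e in zip(self.V, errs):
            for j, x in enumerate(s):
                row[j] -= self.lr * 2.0 * e * x


rnd = RNDBonus(state_dim=2)
visited, unseen = [1.0, 0.0], [0.0, 1.0]
before = rnd.bonus(visited)
for _ in range(200):
    rnd.update(visited)  # the agent keeps returning to the same state
after = rnd.bonus(visited)
print(f"visited: {before:.4f} -> {after:.4f}, unseen: {rnd.bonus(unseen):.4f}")
```

In RND-style methods the bonus is typically added to the environment reward with a weighting coefficient, so that states the predictor has not yet learned look more rewarding. How the thesis combines the bonus with MADDPG's critics is not stated in the abstract.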
Keywords/Search Tags: Reinforcement Learning, Cooperative Control, Swarm Intelligence, Attack-defense Countermeasure, Deep Reinforcement Learning