
Research On Collaborative Search Of UAV Group Based On Multi-agent Reinforcement Learning

Posted on: 2022-11-05, Degree: Master, Type: Thesis
Country: China, Candidate: K Liu, Full Text: PDF
GTID: 2492306764464924, Subject: Automation Technology
Abstract/Summary:
Swarm search and area coverage have a wide range of applications, including military reconnaissance, swarm strike, intelligent plant protection, and environmental exploration, and have been widely studied in both military and civilian fields. The paths planned by existing area search and coverage algorithms do not fully take into consideration the dynamic changes of the swarm or the evolution of the environmental situation, so they cannot obtain the optimal solution under multiple constraints and emergencies. This thesis takes the unmanned aerial vehicle (UAV) swarm search problem as its research topic and focuses on UAV swarm search algorithms based on deep reinforcement learning. The main research contents are as follows.

First, the development from the Markov process to value decomposition in multi-agent deep reinforcement learning is reviewed. The review points out that existing reinforcement-learning-based swarm search algorithms are mainly extensions of single-agent methods and do not incorporate the latest value decomposition methods. Therefore, none of the existing algorithms can be used in a distributed decision-making environment.

Second, a swarm search scheme based on sequential decision-making is proposed to provide a smooth transition from a single-agent system to a multi-agent system. The scheme gives careful consideration to the design of the reward function and to environmental exploration, exploits the ambiguity of the state space in the time domain to make samples interchangeable between different agents, and departs from the assumption that the reward design must be positively or negatively correlated with the task. Simulation comparisons verify that the improved scheme converges more stably.

Third, building on the sequential decision-making scheme, a swarm search scheme based on distributed decision-making is proposed, and the differences between the two schemes in environmental modeling are elaborated. The slow learning speed of the existing non-monotonic value decomposition algorithm, and the reasons for it, are demonstrated in detail at the algorithm level with figures and text, and a fast value decomposition algorithm is proposed. Simulation tests show that the designed swarm search environment is a non-monotonic value decomposition problem and that the new algorithm learns faster than the original algorithm.

Finally, a multi-UAV swarm physical verification platform is designed based on the robot operating system and wireless routing communication. Software-in-the-loop verification and physical flight verification of UAV swarm cooperative search are carried out on this platform, further confirming that the swarm search scheme based on sequential decision-making is superior to and more robust than the existing optimization algorithm.
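
The abstract refers to a non-monotonic value decomposition problem without giving an example. The sketch below is a toy illustration only, not the thesis's environment or algorithm: it assumes a hypothetical two-agent payoff matrix and a plain additive decomposition (each agent gets its own value table and the joint value is their sum), fitted by uniform least squares. The names `PAYOFF`, `additive_fit`, `greedy_joint_action`, and `best_joint_action` are made up for this example. It shows how decentralized greedy action selection on the fitted per-agent values can miss the jointly optimal action when the payoff is non-monotonic.

```python
# Toy illustration (an assumption, not taken from the thesis): an additive
# value decomposition applied to a non-monotonic cooperative payoff.

# Hypothetical joint payoff: rows are agent 1's actions, columns agent 2's.
# Coordinating on (0, 0) is best, but a unilateral deviation is heavily punished.
PAYOFF = [
    [8.0, -12.0, -12.0],
    [-12.0, 0.0, 0.0],
    [-12.0, 0.0, 0.0],
]


def additive_fit(payoff):
    """Least-squares fit payoff[i][j] ~ q1[i] + q2[j] with uniform weights.

    For a complete table the solution is row mean + column mean - grand mean;
    the grand mean is split evenly between the two agents.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    grand = sum(sum(row) for row in payoff) / (n_rows * n_cols)
    q1 = [sum(row) / n_cols - grand / 2 for row in payoff]
    q2 = [
        sum(payoff[i][j] for i in range(n_rows)) / n_rows - grand / 2
        for j in range(n_cols)
    ]
    return q1, q2


def argmax(values):
    """Index of the first maximum in a list."""
    return max(range(len(values)), key=values.__getitem__)


def greedy_joint_action(q1, q2):
    """Each agent independently maximizes its own value table."""
    return argmax(q1), argmax(q2)


def best_joint_action(payoff):
    """Centralized search over all joint actions."""
    cells = [(i, j) for i in range(len(payoff)) for j in range(len(payoff[0]))]
    return max(cells, key=lambda ij: payoff[ij[0]][ij[1]])


q1, q2 = additive_fit(PAYOFF)
greedy = greedy_joint_action(q1, q2)
best = best_joint_action(PAYOFF)
print("decentralized greedy:", greedy, "payoff", PAYOFF[greedy[0]][greedy[1]])
print("joint optimum:", best, "payoff", PAYOFF[best[0]][best[1]])
```

On this matrix the decentralized greedy choice is (1, 1) with payoff 0, while the joint optimum is (0, 0) with payoff 8. Gaps of this kind are what non-monotonic value decomposition methods aim to close.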
Keywords/Search Tags:Unmanned Aerial Vehicle Swarm, Deep Reinforcement Learning, Multi-agent System, Swarm Search, Area-coverage Path Planning