Reinforcement Learning Based Multi-Agent Path Finding

Posted on:2024-09-18

Degree:Master

Type:Thesis

Country:China

Candidate:C Zhao

Full Text:PDF

GTID:2568306932462274

Subject:Cyberspace security

Abstract/Summary:

With the development of artificial intelligence, multi-agent systems have been widely used in military, logistics, rescue, and other fields. Multi-agent path finding (MAPF) underpins multi-agent systems and therefore has important research value. MAPF aims to find conflict-free paths for multiple agents from their start positions to their goal positions. Classical MAPF methods can only be applied in known, fixed environments, and their path-planning efficiency is limited. Owing to its capacity for autonomous learning, reinforcement learning is widely used in automated systems and adapts well to different environments. Inspired by this, this thesis investigates reinforcement learning-based MAPF methods, aiming to improve learning efficiency and environmental adaptability. The main work of this thesis is as follows:

(1) To address the problem of sparse rewards, a MAPF method based on curriculum learning is proposed. It decomposes the MAPF task into sub-tasks ordered from easy to difficult, which alleviates the impact of sparse rewards on performance and increases learning efficiency. The method arranges three curricula for the task: the first two use dense individual rewards to make exploration more directed, and the final one uses team rewards to produce cooperative strategies. Experiments on random-obstacle grid worlds show that the proposed method outperforms state-of-the-art learning-based methods, especially in complex environments with high obstacle density.

(2) To address the problem of the exploding policy space, a MAPF method based on a sequence model is proposed. By transforming the MAPF problem into a sequential decision problem, it intuitively reduces the policy space from exponential to linear and improves scalability. The method establishes a sequential decision paradigm in which agents select actions in a certain order, each according to the observations and actions of the agents that precede it. Experiments on random-obstacle grid worlds show that the proposed method considerably outperforms existing learning-based methods in environments with a large number of agents.
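The exponential-to-linear claim in (2) can be illustrated with a small counting sketch. This is not the thesis's implementation: the five-action move set, the function names, and the toy policy below are all assumptions made for illustration. The sketch compares the number of joint actions a centralized policy would have to choose among with the number of individual choices scored when agents decide one after another, and shows a decision loop in which each agent sees the actions already chosen by its predecessors.

```python
from typing import Callable, List, Sequence, Tuple

# Assumed move set for a 4-connected grid world (not specified in the abstract).
ACTIONS: Tuple[str, ...] = ("wait", "up", "down", "left", "right")


def joint_action_count(num_agents: int, num_actions: int = len(ACTIONS)) -> int:
    """Size of the joint action space when all agents are decided at once."""
    return num_actions ** num_agents


def sequential_choice_count(num_agents: int, num_actions: int = len(ACTIONS)) -> int:
    """Number of individual action choices scored when agents decide in turn."""
    return num_actions * num_agents


# A policy maps (own observation, actions already chosen by predecessors) to an action.
Policy = Callable[[object, Tuple[str, ...]], str]


def select_actions_sequentially(observations: Sequence[object], policy: Policy) -> List[str]:
    """Let agents act in list order, each conditioning on its predecessors' actions."""
    chosen: List[str] = []
    for obs in observations:
        chosen.append(policy(obs, tuple(chosen)))
    return chosen


def toy_policy(obs: object, previous: Tuple[str, ...]) -> str:
    """Hypothetical stand-in policy: pick the first action no predecessor has taken."""
    for action in ACTIONS:
        if action not in previous:
            return action
    return "wait"
```

With these assumptions, eight agents give 5^8 = 390625 joint actions but only 8 x 5 = 40 individual choices per step, which is the kind of gap the sequential formulation is meant to exploit. A learned sequence model would take the place of `toy_policy`.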

Keywords/Search Tags:

Multi-Agent Path Finding, Multi-Agent Reinforcement Learning, Curriculum Learning, Sequence Model

Related items

1	Local Attention-Cooperated Reinforcement Learning For Multi-agent Path Finding
2	Research On Path Planning Based On Multi-Agent Cooperative Communication Learning
3	Research On AGV Path Planning Based On Cooperative Multi-agent Reinforcement Learning
4	Research On Deep Reinforcement Learning Technology For Multi-agent Collaboration
5	Research On Key Technologies Of Multi-agent Cooperation Problems Based On Reinforcement Learning
6	Multi-agent Collaborative Path Planning And Formation Encircling Based On Reinforcement Learning
7	Research On Multi-Agent Path Planning Based On Deep Reinforcement Learning
8	Research On Multi-agent Cooperation Method Based On Deep Reinforcement Learning
9	Research On The Key Technology Of Multi-agent Collaborative Algorithm Based On Deep Reinforcement Learning
10	Research On Multi-Agent Deep Reinforcement Learning Methods And Applications