
Research On Multi-agent Coverage Control Based On Deep Reinforcement Learning

Posted on: 2023-04-18
Degree: Master
Type: Thesis
Country: China
Candidate: S C Liu
Full Text: PDF
GTID: 2568306848467344
Subject: Computer technology
Abstract/Summary:
In recent years, with the continuous development of artificial intelligence technology and the deepening of research on agents, single-agent deep reinforcement learning has produced many excellent results. To solve more complex team tasks, deep reinforcement learning has gradually been introduced into the multi-agent setting. Within multi-agent systems, coverage control is one of the most typical multi-agent coordination problems, and traditional single-agent deep reinforcement learning algorithms struggle to converge effectively on it, with poor learning efficiency and performance. This thesis studies the multi-robot handling system in an intelligent manufacturing workshop, abstracts it as a multi-agent coverage control problem with multiple fixed warehouse points, and conducts the following research:

First, to address the problem that deep reinforcement learning applied to multi-agent coverage control does not converge smoothly, a deep reinforcement learning algorithm, IAAC (Improved Adam Actor-Critic), based on an improved Adam optimizer for proximal policy optimization, is proposed. A sample pool mechanism stores the results of the agents' interactions with the environment and supplies the corresponding samples during multi-agent training. Then, under centralized training with decentralized execution, the Actor-Critic (AC) framework is used to train a centralized critic network that approximates the true value for parameter updates. Finally, a fast-and-slow weight parameter update is introduced into the gradient descent process to improve convergence and learning efficiency.

Second, to address the complexity of multi-agent deep reinforcement learning algorithms and the excessive computation over the feature matrix, a multi-agent coverage control task model, LSA-MAL (Linear Softmax Attention Multi-Agent Landmark), is proposed. A multi-head attention mechanism first maps the input features into three feature matrices, from which a new feature is obtained through dot-product weighted mapping. The Softmax layer is then expanded linearly, and the expansion is partially normalized so that the linear form approximates the original Softmax mapping. Finally, this improved linear formulation replaces the Softmax layer in the multi-head attention.

Finally, a simulation environment based on multi-agent particles is constructed, and comparative experiments and analysis are carried out on the IAAC algorithm and the LSA-MAL model. The experimental results demonstrate the effectiveness of the improved algorithm and model, which improve the agents' training speed and convergence while maintaining the success rate.
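The abstract gives no implementation detail, so the following sketches are illustrative only. First, a minimal sketch of centralized training with decentralized execution under the Actor-Critic framework, as the IAAC description outlines: each agent's actor acts only on its local observation, while a single centralized critic scores the joint observations and actions during training. The two-network layout and the layer sizes are assumptions, not the thesis's architecture.

```python
# Sketch of centralized training with decentralized execution (CTDE).
# Layer widths and the obs/action dimensions are illustrative.
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Decentralized actor: sees only its own local observation."""
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, act_dim))

    def forward(self, obs):
        return torch.tanh(self.net(obs))

class CentralCritic(nn.Module):
    """Centralized critic: evaluates joint observations and actions,
    used only during training to approximate the true value."""
    def __init__(self, joint_obs_dim, joint_act_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(joint_obs_dim + joint_act_dim, 128),
                                 nn.ReLU(), nn.Linear(128, 1))

    def forward(self, joint_obs, joint_act):
        return self.net(torch.cat([joint_obs, joint_act], dim=-1))
```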
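The fast-and-slow weight update is not specified in the abstract; a plausible reading is a Lookahead-style scheme in which Adam advances the fast weights and the slow weights are periodically interpolated toward them. The sketch below assumes that interpretation; the constants k and alpha are illustrative, not taken from the thesis.

```python
# Sketch of a fast/slow weight update wrapped around Adam
# (Lookahead-style); an assumed reading of IAAC's update, not the
# authors' implementation.
import torch

class FastSlowAdam:
    def __init__(self, params, lr=3e-4, k=5, alpha=0.5):
        self.params = list(params)
        self.fast_opt = torch.optim.Adam(self.params, lr=lr)   # fast weights
        self.slow = [p.detach().clone() for p in self.params]  # slow weights
        self.k, self.alpha, self.step_count = k, alpha, 0

    def step(self):
        self.fast_opt.step()          # ordinary Adam step on fast weights
        self.step_count += 1
        if self.step_count % self.k == 0:
            # Every k steps, pull the slow weights toward the fast ones,
            # then restart the fast weights from the interpolated point.
            with torch.no_grad():
                for p, s in zip(self.params, self.slow):
                    s.add_(self.alpha * (p - s))
                    p.copy_(s)

    def zero_grad(self):
        self.fast_opt.zero_grad()
```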
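For LSA-MAL, a common way to linearize the Softmax in attention is a first-order expansion, exp(q·k) ≈ 1 + q·k, with normalized queries and keys, followed by renormalizing the resulting scores. Whether this matches the thesis's exact expansion is an assumption; the sketch shows the general technique of replacing the Softmax layer with a partially normalized linear form.

```python
# Sketch of linear attention replacing the Softmax layer, using the
# first-order expansion exp(q.k) ~ 1 + q.k with l2-normalized q and k.
# The exact expansion used by LSA-MAL is not given in the abstract.
import torch
import torch.nn.functional as F

def linear_attention(q, k, v):
    # q, k, v: (batch, heads, seq, dim), the three feature matrices the
    # multi-head attention maps its input into.
    q = F.normalize(q, dim=-1)
    k = F.normalize(k, dim=-1)
    scores = 1.0 + torch.matmul(q, k.transpose(-2, -1))  # linearized exp
    weights = scores / scores.sum(dim=-1, keepdim=True)  # partial normalization
    return torch.matmul(weights, v)                      # weighted mapping of v
```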
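Finally, the experiments run in a simulation environment based on multi-agent particles. Below is a minimal sketch using PettingZoo's port of the multi-agent particle environment, whose simple_spread scenario is a landmark-coverage task; the module version (simple_spread_v3) and the episode settings are assumptions about the exact setup.

```python
# Sketch of a multi-agent particle coverage rollout via PettingZoo's MPE
# port; the random policy is a placeholder for trained actors.
from pettingzoo.mpe import simple_spread_v3

env = simple_spread_v3.parallel_env(N=3, max_cycles=25)
observations, infos = env.reset(seed=0)

while env.agents:
    # Decentralized execution: a trained actor would map each local
    # observation to an action here; random actions stand in for it.
    actions = {a: env.action_space(a).sample() for a in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)
env.close()
```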
Keywords/Search Tags: multi-agent system, coverage control, proximal policy optimization, attention mechanism, actor-critic algorithm