
Research On Multi-agent Coverage Control Based On Deep Reinforcement Learning

Posted on: 2023-04-18
Degree: Master
Type: Thesis
Country: China
Candidate: S C Liu
Full Text: PDF
GTID: 2568306848467344
Subject: Computer technology
Abstract/Summary:
In recent years, with the continuous development of artificial intelligence technology and the deepening of research on agents, single-agent deep reinforcement learning has produced many excellent results. To solve more complex team tasks, deep reinforcement learning has gradually been introduced into the multi-agent setting. Within multi-agent systems, coverage control is one of the most typical multi-agent coordination problems, and traditional single-agent deep reinforcement learning algorithms struggle to converge effectively on it, with poor learning efficiency and performance. This thesis studies the multi-robot handling system in an intelligent manufacturing workshop, abstracts it as a multi-agent coverage control problem with multiple fixed warehouse points, and conducts the following research:

First, to address the problem that deep reinforcement learning applied to multi-agent coverage control does not converge smoothly, a deep reinforcement learning algorithm, IAAC (Improved Adam Actor-Critic), based on an improved Adam optimizer for proximal policy optimization, is proposed. A sample pool mechanism stores the results of the agents' interactions with the environment and supplies the corresponding samples during multi-agent training. Then, under centralized training with decentralized execution, the Actor-Critic (AC) framework is used to train a centralized critic network that approximates the true value for parameter updates. Finally, a fast-and-slow weight parameter update is introduced into the gradient descent process to improve convergence and learning efficiency.

Second, to address the complexity of multi-agent deep reinforcement learning algorithms and the excessive computation over the feature matrix, a multi-agent coverage control task model, LSA-MAL (Linear Softmax Attention Multi-Agent Landmark), is proposed. A multi-head attention mechanism first maps the input features into three feature matrices, from which a new feature is obtained through dot-product weighted mapping. The Softmax layer is then expanded linearly, and the expansion is partially normalized so that the linear form approximates the original Softmax mapping. Finally, this improved linear formulation replaces the Softmax layer in the multi-head attention.

Finally, a simulation environment based on multi-agent particles is constructed, and comparative experiments and analysis are carried out on the IAAC algorithm and the LSA-MAL model. The experimental results demonstrate the effectiveness of the improved algorithm and model, which improve the agents' training speed and convergence while maintaining the success rate.
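The abstract gives no implementation detail, so the following sketches are illustrative only. First, a minimal sketch of centralized training with decentralized execution under the Actor-Critic framework, as the IAAC description outlines: each agent's actor acts only on its local observation, while a single centralized critic scores the joint observations and actions during training. The two-network layout and the layer sizes are assumptions, not the thesis's architecture.

```python
# Sketch of centralized training with decentralized execution (CTDE).
# Layer widths and the obs/action dimensions are illustrative.
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Decentralized actor: sees only its own local observation."""
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, act_dim))

    def forward(self, obs):
        return torch.tanh(self.net(obs))

class CentralCritic(nn.Module):
    """Centralized critic: evaluates joint observations and actions,
    used only during training to approximate the true value."""
    def __init__(self, joint_obs_dim, joint_act_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(joint_obs_dim + joint_act_dim, 128),
                                 nn.ReLU(), nn.Linear(128, 1))

    def forward(self, joint_obs, joint_act):
        return self.net(torch.cat([joint_obs, joint_act], dim=-1))
```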
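The fast-and-slow weight update is not specified in the abstract; a plausible reading is a Lookahead-style scheme in which Adam advances the fast weights and the slow weights are periodically interpolated toward them. The sketch below assumes that interpretation; the constants k and alpha are illustrative, not taken from the thesis.

```python
# Sketch of a fast/slow weight update wrapped around Adam
# (Lookahead-style); an assumed reading of IAAC's update, not the
# authors' implementation.
import torch

class FastSlowAdam:
    def __init__(self, params, lr=3e-4, k=5, alpha=0.5):
        self.params = list(params)
        self.fast_opt = torch.optim.Adam(self.params, lr=lr)   # fast weights
        self.slow = [p.detach().clone() for p in self.params]  # slow weights
        self.k, self.alpha, self.step_count = k, alpha, 0

    def step(self):
        self.fast_opt.step()          # ordinary Adam step on fast weights
        self.step_count += 1
        if self.step_count % self.k == 0:
            # Every k steps, pull the slow weights toward the fast ones,
            # then restart the fast weights from the interpolated point.
            with torch.no_grad():
                for p, s in zip(self.params, self.slow):
                    s.add_(self.alpha * (p - s))
                    p.copy_(s)

    def zero_grad(self):
        self.fast_opt.zero_grad()
```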
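For LSA-MAL, a common way to linearize the Softmax in attention is a first-order expansion, exp(q·k) ≈ 1 + q·k, with normalized queries and keys, followed by renormalizing the resulting scores. Whether this matches the thesis's exact expansion is an assumption; the sketch shows the general technique of replacing the Softmax layer with a partially normalized linear form.

```python
# Sketch of linear attention replacing the Softmax layer, using the
# first-order expansion exp(q.k) ~ 1 + q.k with l2-normalized q and k.
# The exact expansion used by LSA-MAL is not given in the abstract.
import torch
import torch.nn.functional as F

def linear_attention(q, k, v):
    # q, k, v: (batch, heads, seq, dim), the three feature matrices the
    # multi-head attention maps its input into.
    q = F.normalize(q, dim=-1)
    k = F.normalize(k, dim=-1)
    scores = 1.0 + torch.matmul(q, k.transpose(-2, -1))  # linearized exp
    weights = scores / scores.sum(dim=-1, keepdim=True)  # partial normalization
    return torch.matmul(weights, v)                      # weighted mapping of v
```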
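Finally, the experiments run in a simulation environment based on multi-agent particles. Below is a minimal sketch using PettingZoo's port of the multi-agent particle environment, whose simple_spread scenario is a landmark-coverage task; the module version (simple_spread_v3) and the episode settings are assumptions about the exact setup.

```python
# Sketch of a multi-agent particle coverage rollout via PettingZoo's MPE
# port; the random policy is a placeholder for trained actors.
from pettingzoo.mpe import simple_spread_v3

env = simple_spread_v3.parallel_env(N=3, max_cycles=25)
observations, infos = env.reset(seed=0)

while env.agents:
    # Decentralized execution: a trained actor would map each local
    # observation to an action here; random actions stand in for it.
    actions = {a: env.action_space(a).sample() for a in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)
env.close()
```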
Keywords/Search Tags: multi-agent system, coverage control, proximal policy optimization, attention mechanism, actor-critic algorithm