Multi-Agent Cooperative Strategy Based On Reinforcement Learning Research And Application

Posted on: 2024-09-06

Degree: Master

Type: Thesis

Country: China

Candidate: X Z Chen

GTID: 2568307061968779

Subject: Electronic information

Abstract/Summary:

As artificial intelligence advances, deep reinforcement learning has achieved notable results in single-agent settings. When applied to multi-agent scenarios, however, it runs into new difficulties, including environmental non-stationarity, inefficient communication between agents, and the problem of allocating rewards appropriately. These problems seriously reduce the efficiency of cooperation between agents, so enabling agents to work together toward a designated objective is of considerable practical significance. To address these problems with reinforcement learning methods, this thesis studies multi-agent cooperative environments. The main work is as follows:

1. A multi-agent reinforcement learning algorithm based on a recurrent neural network is proposed to address partial observability in multi-agent cooperative environments. The algorithm implements the Actor network as a bidirectional recurrent neural network. The past observations and agent actions retained in the network give each agent as much information as possible to draw on when making a decision, which improves the effectiveness of its policy and reduces the impact of local observation. In addition, a differential reward allocation mechanism is added to make explicit each agent's contribution to completing the task and to encourage agents to choose more appropriate actions, so that they learn correct behavior policies. Comparison experiments are carried out in both a simulated cooperative task environment and a passive localization task environment. The results show that the proposed method improves algorithm performance more effectively when the task environment is complex.

2. To address credit assignment in multi-agent environments, a multi-agent reinforcement learning algorithm based on value decomposition is proposed. The algorithm uses a centralized, value-decomposed Critic network to compute the policy gradient, which is then used to update the policy network. With this Critic configuration it becomes feasible to evaluate each agent's individual contribution to the overall system reward, to limit the impact of dimensional explosion, and to improve training efficiency. A comparative experiment in a simulated task environment shows that the approach improves both task completion rate and training efficiency.

3. The current mainstream training framework for multi-agent reinforcement learning, "centralized training with decentralized execution", has a weakness: in the training stage, policies are generated from the observation data of all agents, but in the execution stage each agent can obtain only its local observation, which degrades performance. The problem is especially prominent in collaborative task environments. A communication mechanism based on shared experience is therefore proposed. A storage space of a given size is set aside as an experience pool shared among the agents. During both training and execution, agents are allowed to perform parallel read and write operations on the pool through explicit communication, so that each agent can infer the overall task environment and cooperation between agents becomes more efficient. Finally, comparison in a simulated task environment demonstrates the advantage of this method.
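The shared experience pool described in the third contribution can be pictured with a small sketch. The abstract does not give the thesis's actual data layout or API, so everything below is an assumption for illustration only: the class name `SharedExperiencePool`, the methods `write` and `read_others`, the `(agent_id, observation, action)` entry format, and the use of a lock to make parallel reads and writes safe are all hypothetical.

```python
import threading
from collections import deque


class SharedExperiencePool:
    """Hypothetical fixed-size pool that several agents read and write."""

    def __init__(self, capacity):
        # A bounded deque silently drops the oldest entry once it is full.
        self._buffer = deque(maxlen=capacity)
        # The lock serializes access so concurrent agents do not interleave.
        self._lock = threading.Lock()

    def write(self, agent_id, observation, action):
        """Publish one agent's local observation and action to the pool."""
        with self._lock:
            self._buffer.append((agent_id, observation, action))

    def read_others(self, agent_id):
        """Return the entries written by every agent except `agent_id`."""
        with self._lock:
            return [entry for entry in self._buffer if entry[0] != agent_id]

    def __len__(self):
        with self._lock:
            return len(self._buffer)


pool = SharedExperiencePool(capacity=4)


def agent_step(agent_id):
    # Placeholder observation and action; a real agent would use its policy.
    pool.write(agent_id, f"obs{agent_id}", f"act{agent_id}")


# Three agents write to the pool in parallel.
threads = [threading.Thread(target=agent_step, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Agent 0 can now see what agents 1 and 2 observed and did.
seen_by_agent_0 = pool.read_others(0)
```

In this sketch an agent would feed `seen_by_agent_0` into its policy alongside its own local observation, which is one simple way to let it reason about the wider task environment at execution time.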

Keywords/Search Tags:

Multi-agent, Reinforcement learning, Cooperative control, Recurrent neural network, Value function decomposition

Related items

1	Research On The Reinforcement Learning Method And Its Application
2	Research And Application Of Algorithms For Multi-Agent Cooperative Control
3	Research On Multi-agent Attack And Defense Countermeasures Based On Deep Reinforcement Learning
4	A Study Of Multi-agent Reinforcement Learning Based On Weighted Q-value Decomposition
5	Research On Multi-agent Deep Reinforcement Learning In Non-globally Knowable Environment
6	Research On Classes Of Cooperative Optimal Control Algorithms Of Multi-agent Systems Via Reinforcement Learning
7	Research On Multi-Agent Combat Based On Value Decomposition Deep Reinforcement Learning
8	Research On Multi-Agent Reinforcement Learning Method Based On Dueling Transform Structure
9	Research On Flocking Cooperative Control Algorithm Based On Reinforcement Learning
10	Research On Multi-Agent Cooperative Strategy Learning And Training Under Interference Environment