
Research On Key Technologies Of Multi-agent Cooperation Problems Based On Reinforcement Learning

Posted on: 2024-02-20    Degree: Doctor    Type: Dissertation
Country: China    Candidate: P Q Zhao    Full Text: PDF
GTID: 1528306944466474    Subject: Software engineering
Abstract/Summary:
In the world we live in, large numbers of individuals often form a coordinated and orderly state when acting together: flocks of birds and herds of wildebeest during seasonal migration, fish and shrimp swimming in schools in the ocean, wolves hunting in packs, and so on. The coordination, autonomy, and self-organization of these clusters have attracted the research interest of biologists and sociologists. Through observation of various biological clusters in nature, scientists began to explore the relationship between the collective and the individual. To facilitate research, Minsky of the Massachusetts Institute of Technology proposed the concept of an agent and standardized the definition of the field. An agent is an individual with the ability to perceive its environment and make its own decisions. It can be a virtual application (a chatbot, a game AI, a stock trader, etc.) or a real individual in the physical world (a human, a vehicle, an animal, etc.). An agent makes sequential decisions based on its cognition of the environment to determine its next actions. Many real-world tasks are completed by multiple agents, for example the interaction between players in football matches, area scanning by drone swarms, multi-manipulator coordination, and traffic vehicle control. In such complex scenarios, each agent needs to construct a comprehensive strategy that considers both its own situation and the actions of other agents, so that teamwork can be achieved. In response, researchers have proposed multi-agent reinforcement learning to solve such sequential decision-making problems.

At present, many scholars have conducted in-depth research in this field, but several problems remain. First, the credit assignment problem in multi-agent learning makes it difficult to evaluate the contribution of individual agents when only a team reward is available. Second, the problem of environmental cognition in a partially observable environment requires interacting with other agents while facing incomplete environmental information, which makes it difficult to build collaboration. Third, for collaboration in a large-scale environment, as the environment becomes more complex, both the policy-fitting cost and the communication cost increase greatly, which ultimately makes it difficult for the algorithm to converge. This thesis focuses on the multi-agent cooperation problem based on reinforcement learning. The specific contributions are as follows.

First, agents are guided to collaborate by contribution-based intrinsic rewards. In reinforcement learning, an agent optimizes its policy based on feedback from the environment; this feedback, produced by a reward function, is called the environmental reward. Intrinsic rewards are a driving force distinct from environmental rewards: for example, a curiosity-based intrinsic reward drives the agent to explore the environment more, while an intrinsic reward based on cooperation or competition makes the agent pay more attention to the game-theoretic relationships among agents. This thesis first builds a model to evaluate the contribution of the agents' joint actions to the environment. Counterfactual reasoning is then used for credit assignment, apportioning each individual agent's contribution from the total. Finally, this contribution is combined, as an intrinsic reward, with the environmental reward to optimize the policy. When all agents consider contributing to the collective while ensuring their individual interests, the agent-collaboration problem is solved at the global level.

Second, agents collaborate at the behavioral level through agent modeling. Many current agent-modeling methods focus on actions, but instantaneous actions cannot provide effective decision-making information for other agents. This thesis therefore proposes to abstract behaviors on the basis of the action space: a behavior represents a short-term execution goal, so collaboration at the behavioral level is more meaningful. A behavior-generation network is used to obtain each agent's behavior, and mutual information from information theory is used to evaluate the behaviors of agents. The evaluation results serve as intrinsic rewards that encourage agents to cooperate with each other at the behavioral level, thereby establishing global agent collaboration.

Third, multi-agent hierarchical reinforcement learning is used to achieve collaboration among large numbers of agents in complex scenarios. Facing a complex environment, it is very difficult to construct a global environment state, and general reinforcement learning algorithms still perform poorly on high-dimensional environmental states. Similarly, when the number of agents is too large, it is difficult to find the optimal joint action. In this case, the idea of divide and conquer is used to simplify the original problem: the environment is divided into regions, and regulation and collaboration are carried out at the regional level. At the same time, the task is divided into upper-level decision-making and lower-level execution, and large-scale complex-scene problems are solved through collaboration between the two layers.

In summary, these research contents are devoted to building collaborative relationships among multiple agents. Each method proposed in this thesis is compared with classical benchmark algorithms in different environments, and the experimental results show the performance advantages of the proposed algorithms.
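The counterfactual credit-assignment idea above can be sketched numerically. This is a minimal illustration in the style of counterfactual baselines (as in COMA-like methods), not the thesis's actual model: the joint critic `Q` is stubbed as a random table, and the weighting `beta` is an assumed hyper-parameter. Each agent's contribution is the joint value minus a baseline that marginalizes out only that agent's action.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, n_actions = 3, 4

# Hypothetical learned joint critic Q(joint_action) -> scalar,
# stubbed here as a fixed random table for illustration.
Q = rng.normal(size=(n_actions,) * n_agents)

def counterfactual_advantages(joint_action, policies):
    """Per-agent contribution: Q of the taken joint action minus a
    counterfactual baseline that replaces only agent i's action,
    averaged under agent i's own policy."""
    q_joint = Q[tuple(joint_action)]
    advantages = []
    for i in range(n_agents):
        baseline = 0.0
        for a in range(n_actions):
            cf = list(joint_action)
            cf[i] = a                      # counterfactual: swap agent i's action
            baseline += policies[i][a] * Q[tuple(cf)]
        advantages.append(q_joint - baseline)
    return advantages

policies = [np.full(n_actions, 1.0 / n_actions) for _ in range(n_agents)]
adv = counterfactual_advantages([0, 1, 2], policies)

# Combine each agent's contribution, as an intrinsic reward,
# with the shared environmental (team) reward.
team_reward = 1.0
beta = 0.1  # assumed intrinsic-reward weight
shaped = [team_reward + beta * a for a in adv]
```

An agent whose action beats its own counterfactual baseline receives a positive intrinsic bonus on top of the team reward, so individual credit is separated out of the shared signal.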
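The mutual-information evaluation of behaviors can likewise be sketched. The following is a toy empirical estimate over discrete behavior labels (the behavior-generation network is abstracted away; the co-occurrence tables are made-up data): perfectly correlated behaviors yield MI = log 2, independent behaviors yield 0, so the score rewards agents whose behaviors carry information about each other.

```python
import numpy as np

def mutual_information(joint_counts):
    """Empirical MI I(Z_i; Z_j) from a co-occurrence count table of
    two agents' discrete behaviour labels."""
    p = joint_counts / joint_counts.sum()          # joint distribution
    px = p.sum(axis=1, keepdims=True)              # marginal of agent i
    py = p.sum(axis=0, keepdims=True)              # marginal of agent j
    mask = p > 0                                   # avoid log(0)
    return float((p[mask] * np.log(p[mask] / (px @ py)[mask])).sum())

# Perfectly correlated behaviours -> MI = log(2); independent -> 0.
corr = np.array([[5.0, 0.0], [0.0, 5.0]])
indep = np.array([[2.5, 2.5], [2.5, 2.5]])
mi_corr = mutual_information(corr)
mi_indep = mutual_information(indep)
```

Used as an intrinsic reward, a higher MI score pushes agents toward mutually predictable, hence coordinated, short-term behaviors.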
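The divide-and-conquer hierarchy can be illustrated with a minimal two-level sketch: an upper layer assigns each agent a region (sub-goal), and a lower layer greedily executes steps toward it. The grid world, function names, and nearest-region heuristic are illustrative assumptions, not the thesis's actual environments or learned policies.

```python
def assign_regions(agent_pos, region_centers):
    """Upper layer: each agent claims the nearest unclaimed region
    (Manhattan distance), reducing the joint problem to per-region tasks."""
    assignment, taken = {}, set()
    for aid, (x, y) in agent_pos.items():
        best = min((c for c in region_centers if c not in taken),
                   key=lambda c: abs(c[0] - x) + abs(c[1] - y))
        assignment[aid] = best
        taken.add(best)
    return assignment

def step_toward(pos, goal):
    """Lower layer: one greedy grid step toward the assigned sub-goal."""
    x, y = pos
    gx, gy = goal
    if x != gx:
        return (x + (1 if gx > x else -1), y)
    if y != gy:
        return (x, y + (1 if gy > y else -1))
    return pos

agents = {"a0": (0, 0), "a1": (5, 5)}
regions = [(0, 4), (5, 1)]
goals = assign_regions(agents, regions)            # upper-level decision
agents = {aid: step_toward(p, goals[aid])          # lower-level execution
          for aid, p in agents.items()}
```

The point of the decomposition is that neither layer ever reasons over the full joint state-action space: the upper layer sees only regions, and each lower-level policy sees only its own sub-goal.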
Keywords/Search Tags:multi-agent systems, reinforcement learning, hierarchical reinforcement learning, agent modeling, agent collaboration