
Research On Dynamics Of Multi-Agent Reinforcement Learning Based On Repeated Games

Posted on: 2023-08-09
Degree: Master
Type: Thesis
Country: China
Candidate: Y Li
Full Text: PDF
GTID: 2558306620455154
Subject: Software engineering

Abstract:
With the widespread application of swarm intelligence, multi-agent reinforcement learning has become an important research topic. In multi-agent reinforcement learning, agents learn optimal policies through the feedback mechanism of reinforcement learning. The dynamic interactions among large numbers of agents and the non-deterministic environment make multi-agent reinforcement learning complex, and during learning it is difficult to predict an agent's optimal behavioral decisions. Establishing a dynamic model of a multi-agent system can provide qualitative insight into the system's evolution and help explain agents' decision-making. Because the number of agents causes the computational space to explode in dimension and the dynamic interactions to grow complex, current research mainly uses the mean-field method to approximate agent interactions, treating the average strategy of all agents in the system as the opponent's strategy. In practical application scenarios, however, this approximation is not always effective, and the overall average strategy can deviate from the opponent's strategy to varying degrees. To describe the evolution of agents more accurately, this thesis combines game theory and complex networks to study agents' local interactions and connection topology, and proposes two multi-agent reinforcement learning models:

(1) A multi-agent Q-learning model based on repeated games. In a multi-agent system containing a large number of agents, the agents interact through symmetric normal-form games and use the Boltzmann exploration of Q-learning to learn the optimal policy. By deriving the variation of individual Q-values and the change of the population Q-value density, a system of dynamic equations is established to capture the agents' learning process. Agent-based simulations verify that, under different games and different parameter settings, the dynamic model in this thesis accurately describes the evolution of agent behavior. With different initial strategies, the agents converge to the Nash equilibrium strategy through learning, which verifies the stability of the model.

(2) A multi-agent reinforcement learning game model based on random graphs. In this model, the connection topology between agents is represented by a graph, and the agents learn policies through the Q-learning algorithm. The accuracy and generalization ability of the model are verified by theoretical derivation, and the Q-learning dynamics predicted by the model always match the simulation results under different connection topologies. The experiments mainly study the influence of regular and irregular networks on agent learning. On a regular network, every agent has the same degree k, and the smaller k is, the more easily the agents cooperate in the Prisoner's Dilemma game. Irregular networks produce similar results: the smaller the average degree of the network, the more easily the agents cooperate.
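The first model's setting can be illustrated with a minimal sketch: two stateless Q-learning agents repeatedly play a symmetric game and select actions via Boltzmann (softmax) exploration. The payoff matrix, learning rate, and temperature below are assumptions for illustration, not the thesis's exact parameters.

```python
import numpy as np

# Symmetric Prisoner's Dilemma payoffs (assumed values): rows = own action,
# columns = opponent's action. Actions: 0 = cooperate, 1 = defect.
PAYOFF = np.array([[3.0, 0.0],
                   [5.0, 1.0]])

def boltzmann(q, tau):
    """Softmax action probabilities at temperature tau."""
    z = np.exp((q - q.max()) / tau)   # shift by max for numerical stability
    return z / z.sum()

def run(episodes=5000, alpha=0.1, tau=0.5, seed=0):
    rng = np.random.default_rng(seed)
    q = np.zeros((2, 2))              # one row of Q-values per agent
    for _ in range(episodes):
        # Each agent samples an action from its Boltzmann policy.
        acts = [rng.choice(2, p=boltzmann(q[i], tau)) for i in range(2)]
        rewards = [PAYOFF[acts[0], acts[1]], PAYOFF[acts[1], acts[0]]]
        for i in range(2):
            # Stateless Q-learning update for a repeated game.
            q[i, acts[i]] += alpha * (rewards[i] - q[i, acts[i]])
    return q

print(run())
```

In the Prisoner's Dilemma, defection dominates, so each agent's learned Q-value for defecting ends up above its Q-value for cooperating; this matches the qualitative behavior the dynamic equations are built to predict.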
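The networked setting of the second model can likewise be sketched: agents sit on a regular ring lattice of degree k, play the Prisoner's Dilemma with every neighbor, and update a stateless Q-value from their average neighborhood payoff. The lattice construction, payoffs, and parameters are assumptions for illustration; this sketch shows the simulation structure only and does not claim to reproduce the thesis's cooperation results.

```python
import numpy as np

PAYOFF = np.array([[3.0, 0.0],
                   [5.0, 1.0]])   # rows: own action, cols: neighbor's action

def ring_lattice(n, k):
    """Neighbor lists for a regular ring where each node links to its k
    nearest neighbors (k must be even)."""
    half = k // 2
    return [[(i + d) % n for d in range(-half, half + 1) if d != 0]
            for i in range(n)]

def simulate(n=50, k=4, episodes=2000, alpha=0.1, tau=0.5, seed=1):
    rng = np.random.default_rng(seed)
    nbrs = ring_lattice(n, k)
    q = np.zeros((n, 2))
    coop_rate = 0.0
    for _ in range(episodes):
        # Boltzmann policies for all agents at once.
        p = np.exp(q / tau)
        p /= p.sum(axis=1, keepdims=True)
        acts = np.array([rng.choice(2, p=p[i]) for i in range(n)])
        for i in range(n):
            # Average payoff from playing each neighbor on the lattice.
            r = np.mean([PAYOFF[acts[i], acts[j]] for j in nbrs[i]])
            q[i, acts[i]] += alpha * (r - q[i, acts[i]])
        coop_rate = 1.0 - acts.mean()   # fraction of cooperators this round
    return coop_rate

print(simulate(k=4), simulate(k=10))
```

Varying k in calls like these is the kind of experiment the thesis runs to relate network degree to the emergence of cooperation.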
Keywords: Multi-agent, Reinforcement learning, Game theory, Dynamic modeling