In recent years, with the success of AlphaGo and AlphaStar, multi-agent reinforcement learning, as an important branch of reinforcement learning, has developed rapidly. Unlike AlphaStar's macro-level task management, most current multi-agent reinforcement learning models concentrate on micro-operation management, and cooperative multi-agent reinforcement learning tasks in particular require an algorithmic model to control multiple agents so that they cooperate to complete a task. Although many strong algorithms and models exist for cooperative multi-agent reinforcement learning, there remains much room for improvement, such as unsatisfactory performance in some complex scenarios and an unstable learning process when training multiple agents. In existing multi-agent hybrid methods based on value function decomposition, the structure of the mixing model is too simple, so the family of functions the model can represent is small; as a result, these methods cannot achieve good results in some complex environments.

This thesis proposes the Dueling Transform Network model (referred to as ADTL_mix) for cooperative multi-agent reinforcement learning tasks. By introducing the dueling structure from single-agent reinforcement learning into the multi-agent setting and extending it, the mixing network is divided into a state-value mixing network and an advantage-value mixing network. The state-value mixing network uses an attention-based structure to mix the individual state value functions into a joint state value function, while the advantage-value mixing network directly adopts the hypernetwork mixing structure of the QMIX model. The joint action value function is then obtained by forward mixing from these two angles, and the individual agent networks are jointly optimized from the two perspectives; equations and an implementation sketch of this decomposition are given below. The purpose of introducing the dueling transform network is to address the phenomenon that, in many cases, the magnitude of the action value function is independent of the specific action each agent chooses.

To address the instability of the training process, this thesis uses a learning rate decay method based on the cumulative reward value, which adjusts the learning rate at every step or at fixed intervals (a sketch of such a schedule is also given below), together with a transformation model, based on the correspondence between global observations and local observations, that eliminates the differences among local observations. Together, these two methods stabilize the training process. The overall model structure is an improvement on the QMIX model, and the stabilization methods used in this thesis apply not only to the proposed model but to any multi-agent reinforcement learning method based on value functions.

Experiments are carried out on the StarCraft Multi-Agent Challenge (referred to as SMAC), using several representative maps of different difficulty levels. Comparative experiments demonstrate that the dueling transform model proposed in this thesis significantly improves the win rate and training stability on tasks of different difficulty levels compared with previous methods. Finally, the effectiveness of the proposed method is further confirmed by a self-comparison experiment against a model that only changes the agent structure to a dueling structure.
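To make the two-branch decomposition concrete, the following equations give a minimal formulation. The per-agent dueling aggregation follows the standard form of the single-agent dueling network; the additive recombination at the joint level and the operator names f_att and f_hyper are illustrative assumptions, with the exact mixing operators defined in the thesis body.

    Q_i(\tau_i, u_i) = V_i(\tau_i) + A_i(\tau_i, u_i)
                       - \frac{1}{|U|} \sum_{u'} A_i(\tau_i, u')
    V_{tot}(s) = f_{att}\big(V_1, \dots, V_n;\, s\big)
                 % attention-based state-value mixing
    A_{tot}(\boldsymbol{\tau}, \mathbf{u}, s)
                 = f_{hyper}\big(A_1, \dots, A_n;\, s\big)
                 % QMIX-style hypernetwork advantage mixing
    Q_{tot}(\boldsymbol{\tau}, \mathbf{u}, s)
                 = V_{tot}(s) + A_{tot}(\boldsymbol{\tau}, \mathbf{u}, s)

Subtracting the mean advantage inside each agent's head keeps the decomposition identifiable, since otherwise a constant could be shifted freely between V_i and A_i.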
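As an illustration of this two-branch mixing, the following PyTorch sketch pairs an attention-based state-value mixer with a QMIX-style monotonic hypernetwork mixer for the advantage values. All module names, dimensions (N_AGENTS, STATE_DIM, EMBED_DIM), and the single-layer attention layout are assumptions for illustration, not the exact architecture of ADTL_mix.

    # Illustrative sketch only: layer sizes and the attention layout are
    # assumptions; only the two-branch structure follows the abstract.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    N_AGENTS, STATE_DIM, EMBED_DIM = 5, 48, 32

    class AttentionStateValueMixer(nn.Module):
        """Mixes per-agent state values V_i into a joint V_tot using
        attention weights computed from the global state."""
        def __init__(self):
            super().__init__()
            self.key = nn.Linear(STATE_DIM, N_AGENTS)  # one score per agent

        def forward(self, v_agents, state):
            # v_agents: (batch, n_agents), state: (batch, state_dim)
            attn = F.softmax(self.key(state), dim=-1)
            return (attn * v_agents).sum(dim=-1, keepdim=True)  # (batch, 1)

    class HyperAdvantageMixer(nn.Module):
        """QMIX-style mixer: hypernetworks conditioned on the state produce
        non-negative weights, so dQ_tot/dA_i >= 0 (monotonicity)."""
        def __init__(self):
            super().__init__()
            self.hyper_w1 = nn.Linear(STATE_DIM, N_AGENTS * EMBED_DIM)
            self.hyper_b1 = nn.Linear(STATE_DIM, EMBED_DIM)
            self.hyper_w2 = nn.Linear(STATE_DIM, EMBED_DIM)
            self.hyper_b2 = nn.Sequential(nn.Linear(STATE_DIM, EMBED_DIM),
                                          nn.ReLU(),
                                          nn.Linear(EMBED_DIM, 1))

        def forward(self, a_agents, state):
            b = a_agents.size(0)
            w1 = torch.abs(self.hyper_w1(state)).view(b, N_AGENTS, EMBED_DIM)
            b1 = self.hyper_b1(state).view(b, 1, EMBED_DIM)
            h = F.elu(torch.bmm(a_agents.unsqueeze(1), w1) + b1)
            w2 = torch.abs(self.hyper_w2(state)).view(b, EMBED_DIM, 1)
            b2 = self.hyper_b2(state).view(b, 1, 1)
            return (torch.bmm(h, w2) + b2).view(b, 1)  # (batch, 1)

    class DuelingMixer(nn.Module):
        """Q_tot = V_tot(s) + A_tot(s, a), mixed from two separate branches."""
        def __init__(self):
            super().__init__()
            self.v_mixer = AttentionStateValueMixer()
            self.a_mixer = HyperAdvantageMixer()

        def forward(self, v_agents, a_agents, state):
            return self.v_mixer(v_agents, state) + self.a_mixer(a_agents, state)

    # Usage: per-agent values/advantages plus the global state yield Q_tot.
    mixer = DuelingMixer()
    q_tot = mixer(torch.randn(8, N_AGENTS), torch.randn(8, N_AGENTS),
                  torch.randn(8, STATE_DIM))  # shape (8, 1)

Keeping the monotonicity constraint only on the advantage branch leaves the state-value branch free to use an unconstrained attention mixture, which is one way to enlarge the family of joint functions the mixer can represent.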
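The learning-rate decay driven by the cumulative reward could be realized, for example, as follows. The specific trigger rule used here (shrink the step size once the mean return of the most recent episodes exceeds that of the preceding window) and all parameter names are hypothetical choices for illustration; the thesis defines its own decay criterion and interval.

    # Illustrative sketch of a cumulative-reward-driven learning-rate
    # schedule; the trigger rule and hyperparameters are assumptions.
    from collections import deque

    class RewardBasedLRScheduler:
        def __init__(self, optimizer, window=100, decay=0.5, min_lr=1e-5):
            self.optimizer = optimizer  # any object exposing param_groups
            self.window = window
            self.decay = decay
            self.min_lr = min_lr
            self.returns = deque(maxlen=2 * window)

        def step(self, episode_return):
            self.returns.append(episode_return)
            if len(self.returns) < 2 * self.window:
                return
            old = sum(list(self.returns)[:self.window]) / self.window
            new = sum(list(self.returns)[self.window:]) / self.window
            # When the cumulative reward has clearly improved, reduce the
            # learning rate so later updates fine-tune rather than overshoot.
            if new > old:
                for g in self.optimizer.param_groups:
                    g["lr"] = max(g["lr"] * self.decay, self.min_lr)
                self.returns.clear()

Because the schedule only reads episode returns and the optimizer's parameter groups, it is agnostic to the underlying model, which matches the claim that the stabilization methods apply to any value-function-based multi-agent method.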