
Research On Deep Reinforcement Learning In Cooperative Multi-Agent System

Posted on: 2020-01-15
Degree: Master
Type: Thesis
Country: China
Candidate: G H Wang
Full Text: PDF
GTID: 2428330590960939
Subject: Electronic and communication engineering
Abstract/Summary:
Single-agent deep reinforcement learning (DRL) has recently achieved major breakthroughs, and many researchers have begun applying DRL methods to cooperative multi-agent systems, which have greater practical value. Although DRL can provide a general control method for multi-agent systems, the exponential growth of the state-action space makes exploration harder for DRL algorithms and slows their learning. In addition, whether agents can coordinate and cooperate effectively is an important factor in the overall performance of the system. To address these problems, this thesis focuses on the following work:

(1) To address the low learning efficiency of DRL algorithms in large-scale cooperative multi-agent systems, a curriculum training method is proposed. The target system is decomposed into a set of subsystems with an increasing number of agents; the algorithm is first trained on the less difficult subsystems and then switched to progressively more difficult ones until it finally converges in the target system. To combine the IDQN algorithm with the curriculum training method, a multi-head self-attention unit is added to the DQN network to map the variable-length state to a fixed-length feature. In addition, a time-prioritized experience replay technique is proposed, which strikes a compromise between the non-stationarity problem and data utilization to further improve learning efficiency. Experiments show that the curriculum training method effectively accelerates DRL learning in cooperative multi-agent systems.

(2) A multi-agent actor-critic algorithm based on coordinated communication policy networks is proposed to resolve multi-agent conflicts under perceptual constraints. The average signal communication unit and the GRU communication unit are then studied and analyzed, and a novel unit, the GRU weighted signal communication unit, is proposed. This unit uses a GRU to calculate the importance of each message received from other agents and then computes a weighted sum of the messages based on that importance, so that the agent can attend to the information most relevant to itself when making decisions. Experimental results show that the improved algorithm based on the coordinated communication policy network effectively improves overall performance, and that the GRU weighted signal communication unit shows better stability when combined with the curriculum training method.
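
The difference between averaging received messages and weighting them by importance can be illustrated with a minimal sketch. This is not the thesis implementation: in the thesis the importance values come from a GRU, whereas here they are passed in as a plain list of hypothetical scores and normalized with a softmax, which is an assumption made only for illustration. The function names are also hypothetical.

```python
import math


def average_messages(messages):
    """Baseline: element-wise mean of the messages received from other agents."""
    n = len(messages)
    dim = len(messages[0])
    return [sum(m[i] for m in messages) / n for i in range(dim)]


def softmax(scores):
    """Turn raw importance scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_messages(messages, scores):
    """Importance-weighted sum of messages.

    `scores` stands in for the per-message importance; in the thesis these
    are computed by a GRU, which this sketch does not model.
    """
    weights = softmax(scores)
    dim = len(messages[0])
    return [sum(w * m[i] for w, m in zip(weights, messages)) for i in range(dim)]


if __name__ == "__main__":
    received = [[1.0, 0.0], [0.0, 1.0]]  # two messages of dimension 2
    print(average_messages(received))
    print(weighted_messages(received, [10.0, 0.0]))  # first message dominates
```

Both functions return a vector of the same length regardless of how many messages arrive, so the aggregated signal keeps a fixed size as the number of agents grows across curriculum stages.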
Keywords/Search Tags:cooperative multi-agent, deep reinforcement learning, curriculum training, coordinated communication