At present, more and more researchers are engaged in deep reinforcement learning, and its mainstream applications are in the field of games. Reinforcement learning has conquered chess, a complete-information game, and Texas hold'em poker, an incomplete-information game, and it has reached or even surpassed the level of top human players in e-sports games with huge state spaces and complex action spaces. However, reinforcement learning still faces great challenges in fields such as autonomous driving. The main reason is that training a reinforcement learning agent requires an environment for it to interact with, yet constructing realistic simulation scenes is very difficult, and there is no guarantee that the agent will never encounter a state it has not seen. Simulation scenarios therefore need to be explored first, and this paper accordingly studies reinforcement learning in simulation scenarios.

First, for the single-agent game-confrontation scenario, this paper models both sides of the game. Building on proximal policy optimization (PPO), the network structure is improved with a gated recurrent unit (GRU). Through targeted reward design and structural improvement, a game-confrontation decision algorithm is proposed. The algorithm effectively handles decision-making in complex scenes where real rewards are sparse, and markedly improves the success rate of the game algorithm. The results show that, with the deviation angle used as the reward function, the attacker's success rate in winning the game exceeds 95% after 1000 rounds of training.

Then, for the heterogeneous multi-agent game-confrontation scenario, this paper proposes a game-confrontation decision algorithm based on multi-agent deep deterministic policy gradient (MADDPG). The algorithm combines long short-term memory (LSTM) with the actor-critic framework, which not only achieves convergence in a huge state and action space but also addresses the problem of sparse real rewards. In addition, imitation learning is integrated into the decision algorithm, which improves both its convergence speed and, substantially, its effectiveness. The results show that the algorithm can handle a variety of tactical scenarios, make flexible decisions in response to changes in the enemy's behavior, and achieve an average winning rate close to 90%.
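The GRU mentioned above lets the policy network carry information across timesteps in a partially observable confrontation. As a minimal sketch of what one GRU step computes (the standard Cho et al. formulation; the parameter shapes and the toy trajectory are illustrative assumptions, not the thesis's actual network):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x, h, p):
    """One step of a gated recurrent unit.

    x: input features at this timestep, shape (input_dim,)
    h: previous hidden state, shape (hidden_dim,)
    p: dict of weights W_* (hidden_dim x input_dim) and
       U_* (hidden_dim x hidden_dim); biases omitted for brevity.
    """
    z = sigmoid(p["W_z"] @ x + p["U_z"] @ h)             # update gate
    r = sigmoid(p["W_r"] @ x + p["U_r"] @ h)             # reset gate
    h_cand = np.tanh(p["W_h"] @ x + p["U_h"] @ (r * h))  # candidate state
    return (1.0 - z) * h + z * h_cand                    # new hidden state

# Toy usage: a 4-dim observation feeding an 8-dim hidden state,
# unrolled over a short trajectory.
rng = np.random.default_rng(0)
shapes = {"W": (8, 4), "U": (8, 8)}
params = {f"{k}_{g}": rng.standard_normal(shapes[k]) * 0.1
          for k in ("W", "U") for g in ("z", "r", "h")}
h = np.zeros(8)
for t in range(5):
    h = gru_step(rng.standard_normal(4), h, params)
```

The hidden state `h` summarizes the observation history, which is what allows a PPO policy to act on more than the current frame when rewards are sparse.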
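The deviation-angle reward used in the single-agent study could be sketched as follows; this is a hypothetical shaping function under assumed 2-D geometry, with function and parameter names invented for illustration rather than taken from the thesis:

```python
import math

def deviation_angle_reward(agent_pos, agent_heading, target_pos):
    """Hypothetical dense reward: the smaller the deviation between the
    attacker's heading and the line of sight to the target, the higher
    the reward.

    agent_pos, target_pos: (x, y) positions; agent_heading: radians.
    Returns a value in [-1, 1]: 1 when pointing straight at the target,
    -1 when pointing directly away from it.
    """
    # Bearing from the attacker to the target.
    los = math.atan2(target_pos[1] - agent_pos[1],
                     target_pos[0] - agent_pos[0])
    # Wrap the heading deviation into [-pi, pi].
    dev = (los - agent_heading + math.pi) % (2 * math.pi) - math.pi
    # Cosine shaping keeps the reward smooth and bounded.
    return math.cos(dev)
```

A dense term of this kind supplements the sparse win/lose signal, giving the attacker a learning gradient at every step instead of only at the end of a round.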