The problem of offensive and defensive confrontation between agents is a classic adversarial problem; traces of it appear in settings ranging from games to battlefield decision-making. Taking the attack task of a combat unit as the research background, this thesis studies the problem of intelligent confrontation and controls the real-time decision-making behavior of agents with deep reinforcement learning algorithms. The main research contents of this thesis are as follows.

First, we model the combat unit for intelligent confrontation and, based on this model, analyze and mathematically define the confrontation process. We then design and implement an intelligent confrontation system environment.

Second, in the dual-agent confrontation scenario, to address the low sample utilization of reinforcement learning, we introduce a dual-experience-pool mechanism that separates successful and failed experiences into two pools, improving the learning efficiency of samples. To address the problem of exploring different types of actions in this scenario, we introduce an exploration strategy that mixes Ornstein-Uhlenbeck (OU) noise and Gaussian noise, improving exploration efficiency across action types. To address the sparse-reward problem, we design a dense reward function that guides the agent to complete the attack task more efficiently. Experiments in the confrontation environment verify the correctness and effectiveness of the improved DDPG algorithm.

Third, in the multi-agent confrontation scenario, we address the partial observability of agents, the non-stationarity of the learning environment, and agent inertia. We introduce a centralized-training, distributed-execution framework, a bidirectionally coordinated neural network architecture, and a reward mechanism that mixes individual and collective rewards. In the
confrontation environment, experiments verify the effectiveness and superiority of the improved DNE-DDPG algorithm over other benchmark algorithms.

Finally, in the multi-agent confrontation scenario, we introduce the idea of hierarchical learning to address the curse of dimensionality in the action space and the sparse rewards that arise in collaborative decision-making among multiple agents. We divide the confrontation process into a high-level sub-policy selection process and a low-level action execution process: the agents' low-level actions are trained with the proximal policy optimization (PPO) algorithm, while the high-level sub-policy selection is trained with imitation learning. Experiments in the confrontation environment verify the effectiveness of this hierarchical reinforcement learning method.
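The abstract does not give implementation details for the dual-experience-pool mechanism or the mixed exploration noise; the following is only an illustrative sketch of the two ideas. All class names, capacities, and mixing ratios here are assumptions for illustration, not the thesis's actual parameters.

```python
import random
from collections import deque

import numpy as np


class DualExperiencePool:
    """Two replay buffers: one for transitions from successful episodes and
    one from failed episodes. Batches mix both, so rare successful
    experiences are replayed more often than uniform sampling would allow."""

    def __init__(self, capacity=100_000, success_ratio=0.5):
        self.success = deque(maxlen=capacity)
        self.failure = deque(maxlen=capacity)
        self.success_ratio = success_ratio  # fraction of each batch drawn from successes

    def add_episode(self, transitions, succeeded):
        # Route a whole episode's transitions to the matching pool.
        (self.success if succeeded else self.failure).extend(transitions)

    def sample(self, batch_size):
        n_s = min(int(batch_size * self.success_ratio), len(self.success))
        n_f = min(batch_size - n_s, len(self.failure))
        batch = random.sample(list(self.success), n_s) + random.sample(list(self.failure), n_f)
        random.shuffle(batch)
        return batch


class MixedNoise:
    """OU noise (temporally correlated, suited to continuous movement
    actions) blended with Gaussian noise (uncorrelated, suited to other
    action types). The mix weight is a hypothetical tuning parameter."""

    def __init__(self, dim, theta=0.15, sigma_ou=0.2, sigma_gauss=0.1, mix=0.5):
        self.dim, self.theta = dim, theta
        self.sigma_ou, self.sigma_gauss, self.mix = sigma_ou, sigma_gauss, mix
        self.state = np.zeros(dim)

    def sample(self):
        # Ornstein-Uhlenbeck update toward mean 0: dx = theta*(0 - x) + sigma*N(0, 1)
        self.state += self.theta * (-self.state) + self.sigma_ou * np.random.randn(self.dim)
        gauss = self.sigma_gauss * np.random.randn(self.dim)
        return self.mix * self.state + (1.0 - self.mix) * gauss
```

At training time, the noise sample would be added to the deterministic actor's output before clipping to the action bounds, and each finished episode would be filed into the success or failure pool according to whether the attack task was completed.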
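The mixed individual/collective reward mechanism mentioned above can be read as a convex combination of each agent's own reward and a team-level reward. A minimal sketch follows; the weighting parameter `alpha` and the use of the team mean as the collective term are assumptions, since the abstract does not specify the exact blend.

```python
def mixed_reward(individual_rewards, alpha=0.5):
    """Blend each agent's individual reward with the team's mean reward.

    alpha = 1.0 -> purely individual credit; alpha = 0.0 -> purely
    collective credit shared equally by all agents.
    """
    team = sum(individual_rewards) / len(individual_rewards)
    return [alpha * r + (1.0 - alpha) * team for r in individual_rewards]
```

Interpolating between the two extremes is one way to counteract agent inertia: each agent still feels its own contribution, while the shared term keeps all agents' gradients aligned with the collective objective.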