| As network technology advances, cybersecurity incidents such as data leaks, ransomware, and hacking attacks are occurring around the world, and the economic losses they cause are growing significantly. Traditional network security defenses focus on identifying and blocking cyberattacks to keep threats out of the intranet, but a growing number of researchers believe that cyberattacks are unavoidable as highly concealed, unknown threats emerge and evolve. Penetration testing is a security testing method that evaluates the security of a computer system, network, or application by simulating an attacker’s methods. Its purpose is to discover security vulnerabilities so that organizations can patch them and improve their security. Intelligent penetration testing uses artificial intelligence (AI) technology to make the penetration testing process intelligent: AI-driven simulated attacks are used to assess the security of a computer system, network, or application. However, because attack methods are numerous and network scenarios are complex, intelligent penetration testing must solve two problems: attack strategies cannot be executed efficiently when attack methods are complex and diverse, and AI models have difficulty making attack decisions in complex network environments. To address the first problem, this paper proposes and implements an atomic attack framework based on a virtual operating environment. The framework consists of attack atoms built with container technology, combines multiple attack techniques to realize complex attack behaviors, and uses cooperative control to decompose attack tasks and execute them. To address the second problem, the intelligent planning of attack paths for network penetration, an intelligent decision algorithm based on reinforcement learning is proposed to solve the problems of reward sparsity
and excessive action space that reinforcement learning models face in penetration path planning. The main work and contributions of this paper are as follows: (1) An atomic penetration framework based on a virtualized environment is studied and implemented. The framework packages attack programs into attack atoms, constructs a general-purpose attack-atom invocation interface and a Remote Procedure Call (RPC) invocation method, and builds the attack-atom execution environment with virtualization technology. Coordination control decomposes a penetration task into smaller attack tasks, which are then distributed to and executed by the specific attack atoms. (2) A penetration decision method for attack path planning is studied and implemented. The penetration test is modeled as a Markov decision process (MDP) with defined state and action spaces, and reinforcement learning algorithms are introduced to learn it. To improve the training efficiency and stability of the penetration model, optimizations such as prioritized experience replay and a dueling network architecture are added on top of the basic reinforcement learning model. (3) Based on the above reinforcement learning model, action selection strategies are proposed, including a heuristic attack path table and a prior-knowledge-guided decision algorithm, to reduce the likelihood of selecting useless actions. Prior knowledge is used to build a decision guidance mechanism that speeds up model training. The attack path table records the paths of successful attacks on the target network and is incorporated into action selection, so that the model can obtain effective rewards repeatedly. Finally, the HD3QN algorithm and an HD3QN-driven penetration testing system are implemented and compared with the traditional Deep Q-Network (DQN) algorithm on complex networks of different sizes. HD3QN outperforms the other models in every scenario, which demonstrates its effectiveness. |
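For orientation, the two training optimizations named in contribution 2 (read here as prioritized experience replay and the dueling network architecture) are usually written as below. This is the standard formulation from the reinforcement learning literature, not necessarily the exact variant used in this work. All symbols ($V$, $A$, $\delta_i$, $\alpha$, $\beta$, $\epsilon$, $N$) are assumptions introduced for illustration.

```latex
% Dueling architecture: the Q-value is split into a state value V and an
% advantage A, with the mean advantage subtracted for identifiability.
Q(s, a) = V(s) + A(s, a) - \frac{1}{|\mathcal{A}|} \sum_{a' \in \mathcal{A}} A(s, a')

% Prioritized experience replay: transition i is sampled with probability
% proportional to its priority p_i, derived from the TD error \delta_i.
p_i = |\delta_i| + \epsilon, \qquad
P(i) = \frac{p_i^{\alpha}}{\sum_{k} p_k^{\alpha}}, \qquad
w_i = \left( N \cdot P(i) \right)^{-\beta}
```

Here $w_i$ is the importance-sampling weight that corrects the bias introduced by non-uniform sampling from a replay buffer of size $N$. Setting $\alpha = 0$ recovers uniform replay.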
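The action selection idea in contribution 3 can be illustrated with a minimal Python sketch. It assumes that prior knowledge is available as a boolean mask over actions and that the attack path table maps a state to the action that previously succeeded from it. The function names, the `valid_mask` and `path_prob` parameters, and the table layout are all hypothetical and are not taken from the paper's implementation.

```python
import random
from typing import Dict, Hashable, List, Sequence, Tuple

State = Hashable
Action = int


def record_successful_path(
    path_table: Dict[State, Action],
    trajectory: Sequence[Tuple[State, Action]],
) -> None:
    """Store each (state, action) step of a successful attack (hypothetical layout)."""
    for state, action in trajectory:
        path_table[state] = action


def select_action(
    state: State,
    q_values: Sequence[float],
    valid_mask: Sequence[bool],
    path_table: Dict[State, Action],
    epsilon: float,
    path_prob: float,
    rng: random.Random,
) -> Action:
    """Pick an action using prior knowledge, the path table, then epsilon-greedy."""
    # Prior knowledge: keep only actions that are not known to be useless here.
    candidates: List[Action] = [a for a, ok in enumerate(valid_mask) if ok]
    if not candidates:
        # Assumption: if everything is masked, fall back to the full action space.
        candidates = list(range(len(q_values)))

    # Path table: with probability path_prob, replay a step that succeeded before,
    # so the agent reaches the sparse reward again.
    known = path_table.get(state)
    if known is not None and known in candidates and rng.random() < path_prob:
        return known

    # Otherwise, ordinary epsilon-greedy restricted to the candidate actions.
    if rng.random() < epsilon:
        return rng.choice(candidates)
    return max(candidates, key=lambda a: q_values[a])
```

With `path_prob` set to 0 and an all-true mask, this reduces to plain epsilon-greedy selection, which makes it easy to compare the guided and unguided variants under the same training loop.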