
Research On Network Security Defense Decision Based On Reinforcement Learning

Posted on: 2024-06-04
Degree: Master
Type: Thesis
Country: China
Candidate: Z W Xu
GTID: 2568306941484434
Subject: Computer technology
Abstract/Summary:
With the development of network attack and defense technologies, decision-making in network defense has become an active research topic. A network defense strategy must balance defense effectiveness against its impact on normal system services: a poorly chosen strategy may interrupt services and increase defense costs. Moving target defense is an active defense mechanism that evades attackers’ detection and sustained attacks through transformations of the system. To keep the defender secure and its services running normally while minimizing the cost of moving target defense strategies, a reinforcement learning-based defense decision-making method is proposed. The method generates experience samples through network attack-defense games, combines system security and service performance metrics to score defense effectiveness and cost together, and uses the resulting utility of each defense strategy in the samples as training data. Both the attacker and the defender are then modeled with reinforcement learning, so that the defender learns from experience and receives proactive guidance toward the strategy adjustments that maximize expected return. The main work and innovations are as follows:

(1) Evaluation model of defense strategy utility based on security and performance indicators. Existing utility functions in attack-defense game models mainly calculate net gain from the threat level of attack behaviors and the effectiveness of the defense, without considering the defender’s security and service performance attributes and states. To address this limitation, an evaluation model of defense strategy utility based on security and performance indicators is proposed. It evaluates the combined utility of the defender’s security and service performance to measure the net gain brought by a decision; this value serves as the utility function in the attack-defense game model and provides training data for the learning model. A utility evaluation indicator tree is established for the objects under evaluation, indicator weights are analyzed, and indicator values and weights are combined into a score that quantifies strategy utility. The score reflects the quality of a defense strategy, taking both defense effectiveness and cost into account. Experimental results demonstrate that the evaluation method provides a learning basis for guiding defense decision-making.

(2) Maximization method of future payoff for defense strategies based on empirical data. Adjusting a network defense strategy requires balancing defense effectiveness against cost, yet existing game-theoretic decision-making methods derive theoretically optimal decisions from the Nash equilibrium without fully exploiting past experience. Therefore, a maximization method of future payoff for defense strategies based on empirical data is proposed. It represents the defender in the attack-defense game model as a reinforcement learning agent and constructs a moving target defense strategy based on multiple-factor combination changes. Strategy adjustments form the agent’s action set, and the utility obtained through evaluation serves as the environment reward. On this basis, a Q-learning-based defense strategy adjustment algorithm is developed within the attack-defense game model, enabling the defender to choose the strategy adjustment with the highest expected payoff given past experience. Experimental results demonstrate that the reinforcement learning algorithm can guide the defender toward optimal decisions.

(3) Defense strategy parameter adjustment method based on joint state multi-agent reinforcement learning. Because in practical attack-defense scenarios the attacker’s strategy influences the defender’s environment and decisions, games and equilibria in multi-agent reinforcement learning are studied. A defense strategy parameter adjustment method based on joint state multi-agent reinforcement learning is proposed, in which the attacker and the defender share the environment and each makes its own decisions. The joint state set affects the defender’s payoff and decision-making. Through training, the system reaches a stable state in which the defender can make the strategy adjustment with the highest expected payoff in any state. The effectiveness of the method is validated through experiments: 1) using a web system, verifying that multi-agent reinforcement learning can guide moving target defense strategies based on multiple-factor combination changes; 2) validating that the defender, guided by reinforcement learning agents, can adjust defense strategies with regard to both defense effectiveness and cost to maximize payoff, and can approach the game equilibrium through training. The experimental results demonstrate that the defense strategy parameter adjustment method based on joint state multi-agent reinforcement learning enables the defender to make optimal decisions.
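
As a rough illustration of how a weighted utility score can serve as the reward in a Q-learning defense strategy adjustment loop, the following minimal Python sketch may help. It is not the thesis's implementation: the two-indicator weights, the two-state attacker-pressure environment, the indicator values, and all names (`utility`, `q_update`, `step`, `train`) are assumptions made for illustration, and only the single-agent setting of item (2) is covered.

```python
import random

# Hypothetical indicator weights (assumed values, not taken from the thesis).
WEIGHTS = {"security": 0.6, "performance": 0.4}


def utility(security, performance, weights=WEIGHTS):
    """Weighted score of a defense outcome; both indicators lie in [0, 1]."""
    return weights["security"] * security + weights["performance"] * performance


def q_update(q, state, action, reward, next_state, alpha=0.05, gamma=0.9):
    """Apply one standard tabular Q-learning update to q in place."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def step(state, action, rng):
    """Toy environment (assumed): state is attacker pressure, 0 = low, 1 = high.

    Action 0 keeps the current configuration; action 1 mutates it
    (a stand-in for a moving target defense adjustment).
    """
    if action == 1:
        # Mutation resists attacks but degrades service performance.
        security, performance = 0.9, 0.6
    else:
        # Keeping the configuration is cheap but weak under high pressure.
        security = 0.8 if state == 0 else 0.2
        performance = 1.0
    next_state = rng.randint(0, 1)  # attacker pressure changes at random
    return utility(security, performance), next_state


def train(steps=20000, epsilon=0.2, seed=0):
    """Epsilon-greedy Q-learning over the toy environment; returns the Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(2)]  # q[state][action]
    state = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)  # explore
        else:
            action = max(range(2), key=lambda a: q[state][a])  # exploit
        reward, next_state = step(state, action, rng)
        q_update(q, state, action, reward, next_state)
        state = next_state
    return q


if __name__ == "__main__":
    q_table = train()
    for s, row in enumerate(q_table):
        print("state", s, "greedy action", max(range(2), key=lambda a: row[a]))
```

With these assumed indicator values, keeping the configuration scores higher under low pressure and mutating scores higher under high pressure, so the learned greedy policy would be expected to follow that pattern. The joint-state multi-agent method of item (3) would additionally need an attacker agent and a state that combines both sides.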
Keywords/Search Tags: reinforcement learning, network security defense, optimal defense decision, attack-defense game