| In recent years, Deep Reinforcement Learning (DRL) has emerged as a new technology in the field of Artificial Intelligence. It combines the feature extraction ability of Deep Neural Networks (DNNs) with the sequential decision-making ability of Reinforcement Learning (RL), and it has achieved significant results on many important problems. However, the black-box nature of the DNNs used to output the policy in DRL raises many security issues in practical applications, including adversarial attacks, model stealing, and backdoor attacks. In particular, when users outsource tasks to third parties or use third-party pre-trained models, attackers can modify a small amount of training data to influence specific model parameters and implant a backdoor, which may cause the victim model to exhibit malicious behavior when triggered. Because attackers need to influence only a small number of model parameters to implant a backdoor, backdoor attacks are difficult to detect in large-scale models, and the complexity of the model structure makes backdoor defense harder still. This thesis therefore focuses on backdoor attacks and defenses in DRL, a topic of great research value and significance for building trustworthy DRL systems. Existing research on backdoor attacks in DRL still has significant shortcomings in attack efficiency, invisibility, and attack scenarios, while research on backdoor defense in this field is very limited and the defenses that exist have performance shortcomings. To address these issues, this thesis studies backdoor attacks and defenses in DRL in depth and proposes corresponding attack and defense schemes: (1) A backdoor attack scheme based on DRL. During the model training phase, this scheme implants multiple triggers in an adaptive manner, which greatly reduces both the backdoor implantation cost and the impact of backdoor data on model performance compared to current mainstream schemes. During the testing phase, the scheme detects key moments in order to select appropriate states in which to trigger attacks on the victim model, which significantly reduces the attacker's trigger cost and improves attack efficiency and invisibility compared to current mainstream schemes. (2) A backdoor defense scheme for DRL. This scheme uses the generative power of Generative Adversarial Networks (GANs) to reverse-engineer the backdoor triggers present in the model and then eliminates the model's backdoors through behavior cloning. It is the first defense method proposed against DRL backdoor attacks in the image domain. Experimental results show that this scheme can successfully eliminate existing backdoors in the model without affecting model performance. |
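
To make the idea of implanting a backdoor by modifying a small amount of training data more concrete, the following is a minimal sketch of generic fixed-patch data poisoning for image observations. It is not the adaptive multi-trigger scheme proposed in the thesis. The function names `stamp_trigger` and `poison_transitions`, the `(obs, action, reward)` tuple layout, the patch position, and the reward value of 1.0 are all assumptions made for illustration.

```python
import random


def stamp_trigger(obs, patch, top=0, left=0):
    """Return a copy of a 2D grayscale observation with `patch` pasted at (top, left).

    Hypothetical helper: a fixed pixel patch stands in for a backdoor trigger.
    """
    out = [row[:] for row in obs]  # copy so the clean observation is untouched
    for i, patch_row in enumerate(patch):
        for j, value in enumerate(patch_row):
            out[top + i][left + j] = value
    return out


def poison_transitions(transitions, patch, target_action, rate, seed=0):
    """Poison roughly a fraction `rate` of (obs, action, reward) tuples.

    A poisoned tuple gets the trigger stamped on its observation, its action
    replaced by `target_action`, and its reward set to 1.0 (an assumed value)
    so that training associates the trigger with the attacker's action.
    """
    rng = random.Random(seed)
    poisoned = []
    for obs, action, reward in transitions:
        if rng.random() < rate:
            poisoned.append((stamp_trigger(obs, patch), target_action, 1.0))
        else:
            poisoned.append((obs, action, reward))
    return poisoned
```

With a small `rate`, most transitions stay clean, which is why this kind of modification can leave normal task performance largely intact while still teaching the model to respond to the trigger.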