| As technology advances, artificial intelligence has become an important part of human life, replacing human labor in many fields and playing an important role in many key positions. This is an inevitable result of rapid technological development, but it also introduces hidden dangers to the safety of human society. Adversarial sample attacks are a security threat in the field of artificial intelligence that cannot be ignored. Adversarial attack algorithms apply small, imperceptible perturbations to normal input samples so that the artificial intelligence system classifies the resulting adversarial samples, with high confidence, into categories completely different from the original category. This causes errors in the work performed by the system and can even paralyze it, seriously disrupting normal human life and potentially causing major accidents involving personal and property safety. To address the problems that adversarial samples cause in artificial intelligence systems, this paper proposes a new neural network that combines the advantages of the capsule network and the Transformer network, and explores its ability to detect adversarial samples of different modalities. The main work and innovations of this paper are as follows: (1) The reconstruction network of the capsule network has been shown to provide an effective mechanism for detecting image adversarial samples, using the reconstruction loss between the original and reconstructed samples. However, the decision criterion it uses is a global threshold, which can amplify the detection error. Therefore, a sample-centered, class-wise mean-loss detector is proposed, which centers the image and sets a corresponding decision threshold for each category, improving the ability of the capsule network to detect adversarial samples. (2) To address the long training time and large number of parameters caused by the internal structure of the capsule network, this paper proposes an improved encoder structure that incorporates the Transformer network. Input sample features are extracted and, by exploiting the attention mechanism's focus on the internally correlated features of the data, the capsule data structure of the capsule network is improved to integrate the multi-dimensional features of the input samples. The aim is to preserve the capsule data structure, which contains multi-dimensional features, while reducing the high computational complexity of the network during training, so that the reconstruction network learns more critical and richer features for detecting adversarial sample attacks. Experiments show that the improved capsule network outperforms the original capsule network on the pixel-complex CIFAR-10 dataset. (3) To verify the ability of the optimized capsule network to detect audio adversarial samples, an audio adversarial sample attack algorithm based on the echo state network is proposed. It performs multi-step prediction through repeated one-step prediction: the previous 10 data values are used to predict the next data value, and each newly predicted value is then used as a reference value for predicting the following data value. The audio data is converted from waveforms to speech spectrograms through signal processing, and the input-output structure of the optimized capsule network is adapted accordingly. The experiments verify that the network effectively detects the audio adversarial samples generated by the echo state network attack. |
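
The difference between a global threshold and the per-category thresholds described in point (1) can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the per-sample mean subtraction used as "centering", the mean-squared-error loss, and the `scale` factor are all assumptions made for illustration only.

```python
import numpy as np

def center(batch):
    # Assumed reading of "sample centered": subtract each sample's own mean.
    flat = batch.reshape(len(batch), -1)
    return flat - flat.mean(axis=1, keepdims=True)

def reconstruction_loss(x, x_rec):
    # Per-sample mean squared error between centered input and reconstruction.
    return np.mean((center(x) - center(x_rec)) ** 2, axis=1)

def fit_class_thresholds(losses, labels, num_classes, scale=1.0):
    # One threshold per class: the mean clean-sample loss of that class,
    # multiplied by a hypothetical scale factor.
    return np.array([scale * losses[labels == c].mean()
                     for c in range(num_classes)])

def detect(losses, predicted, thresholds):
    # Flag a sample when its loss exceeds the threshold of its predicted class.
    return losses > thresholds[predicted]

# Toy clean-sample losses: class 0 reconstructs well, class 1 poorly.
clean_losses = np.array([0.1, 0.3, 1.0, 3.0])
clean_labels = np.array([0, 0, 1, 1])
thresholds = fit_class_thresholds(clean_losses, clean_labels, num_classes=2)

# Two test samples with their reconstruction losses and predicted classes.
test_losses = np.array([0.25, 1.5])
test_predicted = np.array([0, 1])
print(detect(test_losses, test_predicted, thresholds))
print(test_losses > clean_losses.mean())  # single global threshold
```

In this toy data the two rules disagree on both samples: the per-class rule flags the first sample, whose loss is high for class 0, and accepts the second, whose loss is ordinary for class 1, while the single global mean does the opposite. This is the kind of error a global threshold can introduce when categories differ in how well they reconstruct.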