The ongoing evolution of deep learning has driven rapid progress in speech recognition, with deep neural networks as the computational core. Impressed by the high accuracy and convenience of speech recognition, smart device providers have begun to equip a wide range of smart terminals with voice input interfaces. However, previous studies have shown that speech recognition systems based on deep learning are vulnerable to small, maliciously crafted adversarial perturbations. Audio produced by adding such deliberately generated perturbations to the original speech is referred to as a voice adversarial example, which can stealthily force the transcription into a malicious output. To improve the security of speech recognition systems, it is therefore important to study adversarial examples for automatic speech recognition. In this paper, our experiments show that previously proposed voice adversarial examples are insufficiently robust to the transformations introduced during speech resampling, and can even become completely ineffective, i.e., unable to mislead the speech recognition model. Therefore, based on the Expectation over Transformation (EOT) framework, this paper designs a robust voice adversarial example generation method that withstands the most common transformations in the resampling process, namely time offsets and noise added to the original audio. Moreover, previous methods target recurrent neural networks, whereas automatic speech recognition systems based on convolutional neural networks have recently grown in popularity. Therefore, we also test our robust voice adversarial examples against a speech recognition system based on a convolutional neural network to demonstrate the effectiveness of the proposed method. Finally, to further study the security of voice adversarial examples, this paper builds a voice adversarial example security testing platform that integrates speech recognition models based on both recurrent and convolutional neural networks. On this platform, we conduct a series of experiments to explore the effects of different parameters on the robustness, signal-to-noise ratio, and generation time of the adversarial examples. The experiments show that the proposed robust voice adversarial example generation method can effectively attack speech recognition systems based on both types of neural network, and that the generated adversarial examples are simultaneously robust to noise and time offsets. Based on the testing platform, this paper carries out a comprehensive experimental exploration of voice adversarial examples and draws corresponding conclusions.
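To illustrate the idea behind EOT-based robust adversarial example generation, the following is a minimal PyTorch sketch, not the paper's actual implementation: it minimizes the expected targeted loss over random time offsets (approximated here by a circular shift) and additive Gaussian noise, while keeping the perturbation inside an L-infinity ball. The names `model`, `ctc_loss_fn`, and `target`, as well as all hyperparameter values, are illustrative placeholders.

```python
import torch

def eot_adversarial_attack(model, ctc_loss_fn, audio, target, epsilon=0.05,
                           steps=1000, lr=1e-3, n_transforms=8,
                           max_shift=100, noise_std=0.01):
    """Optimize a perturbation delta so that the *expected* loss over
    random time offsets and additive noise steers the model toward
    transcribing `target`."""
    delta = torch.zeros_like(audio, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)

    for _ in range(steps):
        optimizer.zero_grad()
        loss = 0.0
        for _ in range(n_transforms):
            x = audio + delta
            # random time offset (circular shift as a simple stand-in
            # for the offset introduced by resampling)
            shift = int(torch.randint(-max_shift, max_shift + 1, (1,)))
            x = torch.roll(x, shifts=shift, dims=-1)
            # additive Gaussian noise
            x = x + noise_std * torch.randn_like(x)
            logits = model(x)  # assumed to return log-probs for CTC
            loss = loss + ctc_loss_fn(logits, target)
        # average over sampled transformations = Monte Carlo EOT objective
        (loss / n_transforms).backward()
        optimizer.step()
        # keep the perturbation small enough to remain inconspicuous
        with torch.no_grad():
            delta.clamp_(-epsilon, epsilon)
    return (audio + delta).detach()
```

Because the loss is averaged over freshly sampled offsets and noise at every step, the resulting perturbation cannot rely on a single fixed alignment, which is what makes it survive the resampling-style transformations the abstract describes.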