| With the rapid development of Internet of Things technology, smart speakers, as an auxiliary Internet tool, have become an indispensable part of people's daily lives. In recent years, the security of smart speakers has gradually become a research hotspot. Although most smart speakers on the market encrypt their network traffic, an attacker can still infer what a user is accessing, such as the questions the user asks the smart speaker or the music it plays, by analyzing the length, direction, and timing of the traffic. To counter such encrypted-traffic attacks on user privacy, defenses that mix dummy traffic into real traffic to confuse the attacker have also been proposed. From the perspectives of both attacker and defender, this paper examines the shortcomings of existing attack and defense methods and proposes corresponding solutions. On the attack side, existing deep-learning-based encrypted-traffic attacks typically need to collect a large amount of training data to remain effective; to address this, the paper proposes a few-shot attack method. First, a new attack model is proposed: an improved convolutional neural network with dilated and causal convolutions processes the direction and size features of encrypted traffic, a temporal convolutional network processes its time-domain features, and the two member learners are then combined through ensemble learning. Second, an intuitive data augmentation method is proposed, which generates additional virtual samples in the neighborhood of each training sample by re-padding the traffic pattern of the trace, so that the attack model achieves higher attack accuracy with fewer samples. On the defense side, previous traffic-analysis defenses cannot balance defense effectiveness against defense cost; to address this, the paper proposes a low-latency, lightweight, and easy-to-deploy defense scheme that combines random dummy-traffic padding with delaying of real packets. The padding component decides when and how much dummy traffic to add in a highly random way. In addition, real packets are randomly delayed, and an adaptive random delay algorithm adjusts their distribution along the timeline. Together, these measures ensure that different traces of encrypted traffic from the same smart speaker differ from one another in overall length, packet order, and packet direction, thus rendering deep learning attacks ineffective. |
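
The defense idea summarized above (random dummy-traffic padding plus random delay of real packets) can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the adaptive random delay algorithm is not specified in the abstract, so a plain uniform delay stands in for it. The function name `obfuscate_trace`, the `(timestamp, direction, size)` packet representation, and the parameters `max_dummies`, `max_delay`, and `max_size` are all hypothetical choices made for illustration.

```python
import random
from typing import List, Optional, Tuple

# Hypothetical packet representation (an assumption, not from the paper):
# (timestamp in seconds, direction: +1 outgoing / -1 incoming, size in bytes)
Packet = Tuple[float, int, int]


def obfuscate_trace(
    trace: List[Packet],
    max_dummies: int = 50,      # assumed upper bound on injected dummy packets
    max_delay: float = 0.05,    # assumed upper bound on per-packet delay (s)
    max_size: int = 1500,       # assumed maximum dummy packet size (bytes)
    rng: Optional[random.Random] = None,
) -> List[Packet]:
    """Return a padded and delayed copy of `trace`, sorted by timestamp."""
    rng = rng or random.Random()
    if not trace:
        return []

    # Delay every real packet by a random, non-negative amount.
    # A uniform delay is a stand-in for the paper's adaptive algorithm.
    delayed = [(t + rng.uniform(0.0, max_delay), d, s) for t, d, s in trace]

    # Choose how many dummy packets to add, and when, at random.
    start = min(t for t, _, _ in trace)
    end = max(t for t, _, _ in delayed)
    n_dummies = rng.randint(0, max_dummies)
    dummies = [
        (rng.uniform(start, end), rng.choice((1, -1)), rng.randint(1, max_size))
        for _ in range(n_dummies)
    ]

    # Merge real and dummy packets on a single timeline.
    return sorted(delayed + dummies, key=lambda p: p[0])
```

Because the number, timing, direction, and size of the dummy packets and the per-packet delays are redrawn on every call, two runs over the same input trace will generally differ in overall length, packet order, and direction sequence, which is the property the abstract attributes to the defense.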