| As the Internet has become widely used in all walks of life, network intrusion incidents have increased rapidly. DNS is an important part of Internet infrastructure, and attacks against it have become more and more diverse. Many machine learning and deep learning techniques have been applied to DNS attack detection. Because DNS attacks are diverse, different types of malicious DNS traffic have different characteristics, and most machine learning detection methods cannot effectively detect all DNS attacks. This thesis therefore proposes a detection model that uses deep learning for feature extraction. In addition, DNS traffic in a real network often changes over time. The emergence of many known and unknown DNS attacks changes the characteristics of DNS flows, which produces concept drift. A traditional static model cannot adapt to changing conditions in dynamic data and fails to provide good real-time detection results. This thesis therefore also proposes an adaptive online detection model for malicious DNS traffic under concept drift. First, this thesis proposes a hybrid model, ITransformer_CNN, based on a convolutional neural network and an improved Transformer. After the data packets are analyzed with Zeek, features of the domain name itself are extracted with a Transformer encoder structure, whose multi-head attention mechanism pays closer attention to the sequential relations between domain name characters. The positional relations and word vectors are fed directly into the Transformer using one-hot encoding. The remaining traffic information is preprocessed and sent to the convolutional neural network for feature extraction. The features extracted by the two models are fused into a feature matrix. Second, an online ensemble model that adapts to concept drift is proposed. An adaptive concept drift algorithm is first put forward so that the model adapts more quickly to changes in the data. The method periodically creates a new classifier, trains it on the latest data, and adds it to the ensemble, and it replaces the worst
classifier with a new classifier component once the ensemble reaches its maximum number of members. To ensure both the diversity of the components and the real-time performance of the ensemble model, a dynamic abstaining probability ensemble method is proposed. Each base classifier abstains or votes according to the certainty it shows on a new instance, which determines the pool of classifiers that participate in the final decision. Each participating base classifier is then assigned a dynamic weight according to its real-time performance. The final prediction is obtained from the real-time prediction probability and dynamic weight of each base classifier. Finally, experiments are carried out on the CIC-Bell-DNS 2021 dataset to evaluate the two models proposed in this thesis. For the ITransformer_CNN model, the detection performance of two positional encoding methods is compared first. Other deep learning and machine learning models are then selected for comparative experiments, and the proposed models are also evaluated on other datasets. The experimental results show that the ITransformer_CNN model achieves good detection performance. The online ensemble model is compared with a single model and with other online ensemble models. The results show that the online ensemble model can adapt to concept drift in DNS traffic and has strong adaptability and ensemble ability. |
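
The ensemble procedure summarized above (abstention, dynamic weighting, and replacement of the worst member) can be illustrated with a minimal Python sketch. It is not the thesis's actual algorithm. The names `StubModel`, `Member`, and `AbstainingEnsemble`, the `predict_proba(x)` interface returning the probability that a sample is malicious, the abstention margin around 0.5, and the use of sliding-window accuracy as the dynamic weight are all assumptions made for illustration.

```python
from collections import deque


class StubModel:
    """Hypothetical base classifier that always returns one probability."""

    def __init__(self, p_malicious):
        self.p = p_malicious

    def predict_proba(self, x):
        return self.p


class Member:
    """A base classifier plus a sliding window of its recent hits and misses."""

    def __init__(self, model, window=50):
        self.model = model
        self.hits = deque(maxlen=window)

    def accuracy(self):
        # Assumption: a member with no history gets an optimistic score of 1.0.
        return sum(self.hits) / len(self.hits) if self.hits else 1.0


class AbstainingEnsemble:
    def __init__(self, max_members=5, abstain_margin=0.1):
        self.max_members = max_members
        self.abstain_margin = abstain_margin
        self.members = []

    def add(self, model):
        # When the ensemble is full, evict the member with the lowest
        # recent accuracy (the oldest one on ties) before adding the new one.
        if len(self.members) >= self.max_members:
            self.members.remove(min(self.members, key=Member.accuracy))
        self.members.append(Member(model))

    def predict_proba(self, x):
        if not self.members:
            return 0.5
        probs = [(m, m.model.predict_proba(x)) for m in self.members]
        # A member abstains when its probability is too close to 0.5.
        voters = [(m, p) for m, p in probs
                  if abs(p - 0.5) >= self.abstain_margin]
        if not voters:
            voters = probs  # every member is uncertain: use the full pool
        total = sum(m.accuracy() for m, _ in voters)
        if total == 0:
            return sum(p for _, p in voters) / len(voters)
        # Weight each voter's probability by its recent accuracy.
        return sum(m.accuracy() * p for m, p in voters) / total

    def update(self, x, y):
        # Record whether each member classified the labeled sample correctly.
        for m in self.members:
            predicted = m.model.predict_proba(x) >= 0.5
            m.hits.append(int(predicted == bool(y)))
```

In this sketch, calling `update` with each newly labeled sample keeps the weights current, and calling `add` with a classifier trained on the latest data plays the role of the periodic replacement step. How often new classifiers are created and how they are trained is left out.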