| The rapid development of the Internet has brought great convenience to people’s life and work and has driven social and economic development, but it has also brought Internet security problems. Malicious code has always been a leading Internet security issue. Traditional malicious code detection methods analyze only the static characteristics of malicious code: whether they rely on code decompilation or on machine learning for feature extraction, they examine only the characteristics of the code file itself. As technology develops, malicious code authors use polymorphism and other techniques to encapsulate their code files. As a result, traditional methods consume a great deal of manpower in code file analysis and still deliver unsatisfactory detection results. To address these shortcomings and the problem of malicious code family classification, this paper proposes a dynamic behavior detection technique for malicious code based on CNNs (convolutional neural networks). A dynamic behavior report file is obtained by running each malicious code file in a real operating system environment. A word vector model converts the text of the report into vector information. The CNN model uses multiple convolution kernels to extract features from the dynamic behavior vectors and train on them, and thereby addresses the problems of malicious code detection and family classification. The research covers the construction of the Cuckoo sandbox environment, word vector processing of the dynamic behavior reports, construction and optimization of the CNN model, and binary and multi-class classification of malicious code. The main contents of this paper are as follows: 1) For dynamic behavior analysis, the malicious PE files are run in a Windows 7 guest machine. The paper uses an Ubuntu 16.04 virtual system as the host and a Windows 7 system installed in VirtualBox as the guest. Each malicious sample is executed in the Windows 7 guest, and the corresponding behavior report file is obtained from its actual execution. 2) To address the redundant information in the behavior report TXT files, a dynamic behavior analysis method based on API calls is proposed. A Python script extracts the category and API fields from each TXT file, and an ordered word vector model converts this text into vector information. Because traditional approaches extract features insufficiently, the CNN model uses multiple convolution kernels to extract further information from the input layer. Compared with traditional malicious code detection techniques, the CNN-based dynamic behavior detection technique handles the detection of new malicious code and family classification well. 3) Building on the CNN above, the model is further optimized. Comparison experiments between the CNN model and machine learning classification algorithms show that the CNN outperforms the other classifiers in malicious code classification. In the classification and detection experiment on the seven categories of malicious code in the experimental data, the detection rate reaches 90%. In the comparison with the classification produced by common anti-virus software, the detection rate of the CNN model is 94%. The experiments show that the CNN detection model proposed in this paper can effectively address malicious code detection and family classification. |
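
The preprocessing step in item 2 (extracting the category and API fields, then converting the call sequence into vectors) can be illustrated with a minimal Python sketch. It is not the paper's script. The report layout (`behavior` → `processes` → `calls`, loosely modelled on Cuckoo's JSON report), the `category:api` token format, the reserved ids, and all function names are assumptions made for illustration. The encoding shown is a plain order-preserving token-to-index mapping, which may differ from the word vector model the paper actually uses.

```python
import json

PAD, UNK = 0, 1  # assumed reserved ids: padding and unseen tokens


def extract_calls(report_text):
    """Return "category:api" tokens, in call order, from one report.

    Assumes a hypothetical JSON layout: behavior -> processes -> calls.
    """
    report = json.loads(report_text)
    tokens = []
    for process in report.get("behavior", {}).get("processes", []):
        for call in process.get("calls", []):
            tokens.append(f"{call['category']}:{call['api']}")
    return tokens


def build_vocab(token_lists):
    """Assign integer ids in order of first appearance, after PAD/UNK."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab) + 2)
    return vocab


def encode(tokens, vocab, max_len):
    """Map tokens to ids, then truncate or right-pad to max_len."""
    ids = [vocab.get(tok, UNK) for tok in tokens[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


# Made-up sample report with three API calls from one process.
sample = json.dumps({"behavior": {"processes": [{"calls": [
    {"category": "file", "api": "NtCreateFile"},
    {"category": "registry", "api": "RegOpenKeyExW"},
    {"category": "file", "api": "NtCreateFile"},
]}]}})

tokens = extract_calls(sample)
vocab = build_vocab([tokens])
print(encode(tokens, vocab, max_len=5))  # [2, 3, 2, 0, 0]
```

Fixed-length integer sequences of this kind are the usual input to an embedding layer followed by convolution kernels of several widths. The padding length and vocabulary handling here are illustrative choices, not values taken from the paper.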