| With the development of network and information technology, malicious software (malware) attacks have caused a steady increase in security incidents and have become a major threat to network security. Malware uses anti-tracking, packing, and code-mutation techniques to survive for long periods and evade detection, so the number of malware attacks tends to grow exponentially. Many computer users, companies, and governments have been seriously affected, and malware detection remains an active research topic, with particular attention paid to detecting malware both accurately and quickly. This paper proposes a neural network model that combines a convolutional neural network (CNN) with a recurrent neural network (RNN) to detect and classify malware. Because malware can use mutation techniques to change its code dynamically and generate many variants, it is difficult to detect and classify. However, malware variants share one invariant at every stage: they must call API services provided by the operating system to carry out their harmful behavior. This paper exploits that property and uses the API function call sequence as the feature from which the neural network model learns to detect and classify malware. Analyzing the API calls that an application makes to the system indicates whether the software's behavior is harmful, and a deep learning model can mine the API call sequence further to determine whether the software is malware. The long short-term memory (LSTM) model, a variant of the RNN, has gating units, so its output depends not only on the current input but also on the outputs at earlier positions in the sequence, which gives it a clear advantage on complex sequence problems. This paper therefore analyzes the API function calls of five major malware families (Ramnit, Lethic, Sality, Emotet, and Ursnif) extracted from the EMBER dataset, together with the malicious behaviors that API function calls made by Windows applications may cause. The LSTM network is the main component; it is combined with a CNN and the word2vec word-embedding framework to build a CNN+LSTM neural network model for detecting and classifying malware. Four methods are compared experimentally: an embedding-based LSTM model, an embedding-based CNN+LSTM model, a word2vec-based LSTM model, and a word2vec-based CNN+LSTM model. The results show that the word2vec-based CNN+LSTM model detects and classifies malware effectively, raising the accuracy to 98%. |
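
The gating behavior described in the abstract can be illustrated with a minimal sketch. It is not the paper's model: it is a toy LSTM cell with a single scalar hidden state, written in plain Python. The embedding values, the gate parameters, and the helper names `lstm_step` and `encode_sequence` are all assumptions made for illustration; in the paper's setting the embeddings would come from word2vec and the gate weights would be learned. The point is only that the final state depends on the order of the API calls, not just on which calls occur.

```python
import math

# Hypothetical scalar "embeddings" for two Windows API names. A real model
# would use learned vectors (e.g. from word2vec); these values are made up.
EMBEDDING = {
    "VirtualAlloc": 1.0,
    "WriteProcessMemory": -1.0,
}

# Arbitrary (input weight, recurrent weight, bias) per gate. All gates share
# the same values here only to keep the sketch short; training would set them.
GATES = {
    "forget": (1.0, 0.5, 0.0),
    "input": (1.0, 0.5, 0.0),
    "output": (1.0, 0.5, 0.0),
    "candidate": (1.0, 0.5, 0.0),
}


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def pre_activation(gate: str, x: float, h_prev: float) -> float:
    """Weighted sum of the current input and the previous hidden state."""
    w_x, w_h, b = GATES[gate]
    return w_x * x + w_h * h_prev + b


def lstm_step(x: float, h_prev: float, c_prev: float) -> tuple[float, float]:
    """One time step of a scalar LSTM cell; returns (hidden, cell) state."""
    f = sigmoid(pre_activation("forget", x, h_prev))      # how much old memory to keep
    i = sigmoid(pre_activation("input", x, h_prev))       # how much new content to admit
    o = sigmoid(pre_activation("output", x, h_prev))      # how much memory to expose
    g = math.tanh(pre_activation("candidate", x, h_prev)) # proposed new content
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


def encode_sequence(api_calls: list[str]) -> float:
    """Run the cell over an API call sequence and return the final hidden state."""
    h, c = 0.0, 0.0
    for name in api_calls:
        h, c = lstm_step(EMBEDDING[name], h, c)
    return h


# The same two calls in different orders give different final states.
h_ab = encode_sequence(["VirtualAlloc", "WriteProcessMemory"])
h_ba = encode_sequence(["WriteProcessMemory", "VirtualAlloc"])
print(h_ab, h_ba)
```

Because each step feeds the previous hidden and cell states back into the gates, reordering the calls changes the result. This is the property that makes a recurrent model a natural fit for API call sequences; a full model would use vector-valued states and a classification layer on top.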