| With the development of Internet technology and the network economy, information systems of many kinds are now widely used. In pursuit of large economic gains, criminals have begun to attack these systems frequently, and the resulting growth in malicious software has caused heavy economic losses for individuals and enterprises. Compared with general computer protection tools such as Master Lu and Computer Housekeeper, malware detection systems have become the main means of maintaining network security. However, in the context of big data, existing malware detection methods still have shortcomings, such as high computational cost, long detection time and low detection accuracy, and they cannot detect malware effectively. With the development of advanced technologies such as artificial intelligence and 5G, deep learning has, to a certain extent, improved generalization ability and the ability to extract features automatically. Against this background, this paper uses a deep learning algorithm to study malware attack detection. The main contents are as follows. Firstly, the purpose and significance of studying malware detection are introduced, and the malware detection methods proposed by scholars at home and abroad are summarized. Current malware detection technologies are found to include behavior-based, feature-based and heuristic methods. Most detection methods rely on the static features of malware and lack dynamic feature information, while dynamic feature detection methods cannot guarantee that all of the malicious code is executed when the malware is run, and they lack temporal and spatial information. In addition, feature extraction for malware is too subjective. Secondly, related concepts and theories, such as malware and machine learning, are introduced. Finally, the Alibaba Cloud security malware detection data set is selected, and the malware is classified with a CNN-LSTM model. In this model, the API call sequence and other information generated by the dynamic analysis of malware are taken as the main input, and the API call sequences belonging to the same thread are spliced together to form a new data set. Word2Vec is combined with a CNN to extract features from the new data set: the Word2Vec model refines the API document information, and the CNN layer reduces the dimensionality of the API features. The result is then fed into an LSTM layer for model training, and the model finally predicts whether a sample is malware. A comparison of the mean square error, accuracy, recall, F1-score and loss value of the CNN-LSTM, XGBoost and TextCNN models shows that the CNN-LSTM model constructed in this paper has the highest prediction accuracy, with an accuracy of 93.44% and a recall of 92.55%, which indicates that the CNN-LSTM model performs well in the field of malware detection. |
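
The thread-wise splicing step described in the abstract can be illustrated with a minimal sketch. The record layout (`file_id`, `tid`, `index`, `api`) and the function name `splice_by_thread` are assumptions made for illustration only; they are not taken from the paper, and the actual preprocessing may differ.

```python
from collections import defaultdict


def splice_by_thread(records):
    """Group API calls by (file_id, tid) and splice each group into one
    ordered call sequence.

    Each record is assumed (hypothetically) to be a dict with the keys
    "file_id", "tid", "index" (call order within the thread) and "api".
    """
    grouped = defaultdict(list)
    for rec in records:
        grouped[(rec["file_id"], rec["tid"])].append((rec["index"], rec["api"]))

    # Sort each thread's calls by their index, then keep only the API names.
    return {
        key: [api for _, api in sorted(calls, key=lambda call: call[0])]
        for key, calls in grouped.items()
    }


# Illustrative records only, not real data from the data set.
records = [
    {"file_id": 1, "tid": 100, "index": 1, "api": "RegOpenKeyExA"},
    {"file_id": 1, "tid": 100, "index": 0, "api": "LdrLoadDll"},
    {"file_id": 1, "tid": 200, "index": 0, "api": "NtCreateFile"},
]
sequences = splice_by_thread(records)
```

Each spliced sequence could then be treated as a sentence of API tokens for a Word2Vec-style embedding before the CNN and LSTM layers; that downstream step is not shown here.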