| With the spread of the Internet and the widespread use of the mobile Internet, malicious software is propagating faster and growing rapidly in number. Malware authors use techniques such as obfuscation and code transformation to generate new variants that evade traditional detection methods. This challenges traditional malware analysis and detection and poses a significant threat to computer system security. Grouping malware with similar behavior or characteristics into families plays an important role in malware analysis and detection, and using machine learning, especially deep learning, to identify malware quickly and accurately has become a research trend. This thesis studies a malware classification algorithm based on convolutional neural networks (CNNs). It first transforms each malware sample into a grayscale image and then exploits the strengths of CNNs in image processing to build an integrated model for detection and classification. The integrated model avoids the complex image feature-extraction preprocessing required by traditional classification models, which greatly shortens computation time. Experiments show that the proposed model significantly improves malware recognition and classification accuracy compared with traditional models. The main research contents are as follows: (1) Analyze the current state of malware and its security threats, and summarize the research results and methods commonly used in malware detection at this stage. (2) Use the frequency characteristics of opcodes extracted from disassembled files as the input to traditional machine learning classification algorithms in order to improve their classification accuracy. (3) Convert each executable from its binary stream into a hexadecimal byte stream, and convert that byte stream into a grayscale image suitable for a convolutional neural network. (4) Exploit the strengths of convolutional neural networks in image processing to avoid complex image preprocessing and feed the original grayscale image directly into the network, with the aim of improving both the efficiency of grayscale image recognition and the accuracy of malware classification. Extensive experiments show that the CNN-based method proposed in this thesis achieves a classification accuracy of 99.3% or more in malware classification and yields better classification results. |
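
The byte-stream-to-image conversion described in item (3) can be illustrated with a minimal sketch. It is not the implementation used in the thesis. It assumes that each byte of the executable (one hexadecimal pair, 00 to FF) becomes one pixel with intensity 0 to 255, that pixels are laid out row by row, and that the last row is zero-padded. The function name `bytes_to_grayscale` and the default width of 256 pixels are hypothetical choices made for illustration; the abstract does not specify either.

```python
def bytes_to_grayscale(data: bytes, width: int = 256) -> list[list[int]]:
    """Lay out raw bytes as rows of grayscale pixels (one byte per pixel).

    The width is an assumed parameter; the source does not state
    how image dimensions are chosen.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Ceiling division: number of rows needed to hold every byte.
    height = -(-len(data) // width)
    # Zero-pad so the final row is complete (padding is an assumption).
    padded = data.ljust(height * width, b"\x00")
    # Iterating over bytes yields integers in the range 0..255.
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


# Usage: 770 bytes at width 256 need 4 rows; the last row is mostly padding.
sample = bytes(range(256)) * 3 + b"\x4d\x5a"
image = bytes_to_grayscale(sample, width=256)
print(len(image), len(image[0]))  # 4 256
```

The resulting rows can be passed to any array or imaging library before being fed to a convolutional network. Because the mapping is a direct reshaping of the byte stream, no hand-crafted feature extraction is involved, which is the property the abstract emphasizes.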