
Malware Classification Technology Research And System Implementation

Posted on: 2023-11-11
Degree: Master
Type: Thesis
Country: China
Candidate: P C Xu
GTID: 2558307061451194
Subject: Computer technology
Abstract/Summary:
Malware classification has long been an important research direction in cyber security. Depending on the dimension of classification, it can be divided into functional classification and family classification. In research on functional classification, static analysis is easily defeated by obfuscation, so related work mainly uses dynamic analysis to extract behavior information and has made some progress. Two problems remain, however: the semantic features of behavior are not fully explored, and the similarity between behaviors is not effectively represented. In research on family classification, with the development of graph neural networks, researchers have begun to analyze family relationships using graphs that express the execution logic of malware. Current work, however, focuses mainly on improving the performance of the graph neural network and does not fully consider the structural characteristics of the malware itself, so the rich semantic information in the original assembly code is lost. In response to these problems, this thesis conducts an in-depth study of malware classification. The main research results are as follows:

(1) A functional classification method for malware based on API (application programming interface) call sequences is proposed. The API call sequence produced during malware execution is extracted by dynamic analysis as behavior information, and sequence features are extracted using ideas from text classification. To express the similarity and relatedness between APIs, a Word2Vec model maps the APIs into a vector space. To fully mine the behavioral characteristics of malware, a two-channel feature extraction model is designed: a CNN (Convolutional Neural Network) extracts local contextual relationships in the API sequence, a BiLSTM (Bi-directional Long Short-Term Memory) network extracts global temporal relationships, and the two models are combined in parallel to fuse the two sets of class probabilities. Several sets of comparative experiments verify the effectiveness of the design, and the final classification accuracy reaches 98.02%.

(2) A malware family classification method based on control flow graphs is proposed. The control flow graph extracted from the malware represents the internal associations of the malware family, and a feature vector for the whole graph is generated by graph embedding. To fully extract the semantic information of the assembly code in the basic blocks, a sentence vector model is trained to learn the semantic features of the instructions, and node attributes are computed by combining these with manually extracted syntactic features. Because the direction of edges in the control flow graph reflects execution logic, a graph convolution propagation rule for directed graphs is designed. To distinguish the importance of the attribute information of different graph nodes, a pooling layer based on an attention mechanism is introduced, which assigns different attention weights to nodes and aggregates node attributes according to those weights to compute the graph embedding. Controlled experiments verify that each module in the scheme improves the classification results to some extent, and the best classification accuracy reaches 97.87%.

(3) A prototype malware analysis system is designed and implemented. Based on the two classification methods, an automatic malware analysis platform is built. A visual interactive interface displays the analysis results and the sample information extracted during analysis.

In summary, this thesis explores malware classification technology in depth. A series of experimental results shows that the proposed methods can effectively identify the function type and family type of malware and can provide effective automatic analysis of malware. The solution can be further applied in areas such as cyber security construction, emergency response, and tracing of attack organizations.
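
The attention-based pooling step described in (2) can be illustrated with a minimal sketch. The abstract states only that nodes receive different attention weights and that node attributes are aggregated according to those weights; the scoring function used here (a dot product with a `context` vector, normalized by softmax), the function names, and the toy node attributes are all assumptions for illustration, not the thesis's actual layer.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(node_attrs, context):
    """Aggregate node attribute vectors into a single graph embedding.

    Assumption: each node is scored by the dot product of its attribute
    vector with a context vector (learned in a real model, fixed here).
    """
    scores = [sum(a * c for a, c in zip(attrs, context)) for attrs in node_attrs]
    weights = softmax(scores)
    dim = len(context)
    # Graph embedding = attention-weighted sum of the node attribute vectors.
    embedding = [
        sum(w * attrs[i] for w, attrs in zip(weights, node_attrs))
        for i in range(dim)
    ]
    return weights, embedding


# Hypothetical attributes for three basic blocks of a control flow graph.
nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = [1.0, 0.0]
weights, graph_embedding = attention_pool(nodes, context)
```

In this toy input the first and third nodes align with the context vector and therefore receive larger weights than the second, so they contribute more to the graph embedding. In a trained model the context vector would be a learned parameter rather than a fixed value.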
Keywords/Search Tags:Malware, Feature extraction, Neural network, Attention mechanism, Cyber security