Organization And Classification Method Of Malicious Code Based On Multi-feature And Multi-model Fusion

Posted on: 2023-03-27
Degree: Master
Type: Thesis
Country: China
Candidate: S Lv
Full Text: PDF
GTID: 2558306848455524
Subject: Engineering
Abstract/Summary:
With the rapid development of Internet technologies, people enjoy the convenience of the Internet but are exposed to more hidden dangers in an increasingly complex network environment. Many attackers target users maliciously, resulting in the disclosure of personal privacy or even the breakdown of the attacked system. The advanced persistent threat (APT), the most lethal kind of network attack, is distinguished from traditional network attacks by its stronger targeting, more complicated techniques, longer incubation period and more professional decision-making team. Most organizations that initiate APT attacks are top hacker groups tied to countries, armies and enterprises. They mainly carry out attacks by means of malicious code, which may lead to serious consequences through the disclosure of national secrets, military secrets and business data. Several organization and classification methods for malicious code have been proposed in China and abroad, but their results are unsatisfactory because of relatively low accuracy and recall. A new method is therefore needed to organize and classify malicious code more effectively.

To address these problems, this thesis proposes a multi-feature, multi-model classification method based on the StackingCV fusion algorithm. The core of the method is as follows:

1. Assembly instruction features with temporal properties, together with byte histogram features, byte entropy histogram features and file static features without temporal properties, are extracted from malicious code files.
2. The time-sequence features from Step 1 are fed into a bidirectional long short-term memory (BiLSTM) neural network for training.
3. The features without temporal properties from Step 1 are fed into the LightGBM algorithm for training.
4. The prediction results of Steps 2 and 3 are integrated by a multi-layer perceptron for secondary training, which yields the final model.

This method addresses the time cost and inefficiency of traditional manual classification. Moreover, with accuracy, precision, F1-score and recall about 3% higher than existing methods, it also improves on the poor classification performance caused by relying on a single feature and a single classification model.

Finally, a Hummingbird network sandbox detection system was developed. In addition to the classification model above, the system integrates two patents: a network-traffic sandbox detection method based on a virtual network environment, and a similarity determination method for malicious samples based on the Jaccard coefficient. The former can collect, respond to and decode the traffic generated by malicious code, while the latter can quickly assign malicious code to its similarity space according to learned character strings and then generate similar strings, enabling effective tracing of, response to and strikes against APT organizations. The system also provides interfaces and functions for malicious code uploading, user-defined parameter configuration, traffic behavior monitoring, homology analysis of malicious code and analysis report generation, which facilitates configuration management and operation for users.
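
As a rough illustration of two building blocks named in the abstract, the Python sketch below computes a byte histogram, a simplified byte entropy histogram, and a Jaccard coefficient between two sets of strings. It is not the thesis implementation: the abstract does not specify how these quantities are computed, so the function names, the 1024-byte window, the 16 entropy bins, and the choice to compare sets of extracted strings are all assumptions made for this example.

```python
import math
from collections import Counter


def byte_histogram(data):
    """Normalized frequency of each of the 256 possible byte values."""
    counts = Counter(data)
    total = len(data) or 1  # avoid division by zero on empty input
    return [counts.get(b, 0) / total for b in range(256)]


def shannon_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(data).values())


def entropy_histogram(data, window=1024, bins=16):
    """Simplified byte entropy histogram (assumed scheme, not the thesis's).

    The file is cut into fixed-size, non-overlapping windows; each window's
    entropy is placed into one of `bins` equal-width bins over [0, 8] bits.
    """
    hist = [0] * bins
    chunks = [data[i:i + window] for i in range(0, len(data), window)]
    for chunk in chunks:
        idx = min(int(shannon_entropy(chunk) / 8.0 * bins), bins - 1)
        hist[idx] += 1
    n = len(chunks) or 1
    return [c / n for c in hist]


def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two sets."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets are identical
    return len(a & b) / len(a | b)
```

The histogram and entropy vectors have fixed lengths regardless of file size, which is what makes them suitable as non-temporal inputs to a tree-based learner such as LightGBM. The Jaccard function would be applied to whatever string sets the similarity method extracts from each sample.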
Keywords/Search Tags: Advanced Persistent Threat (APT), Bidirectional Long Short-Term Memory, StackingCV fusion algorithm, Hummingbird network security sandbox