| With the continuous development of network technology, cyber attackers are constantly developing new attack techniques. Advanced Persistent Threat (APT) attacks, a relatively new class of attack in network security, have rapidly emerged as a significant threat. APT attacks not only threaten national security and stability, but also pose serious risks to the vital interests of society and individuals, so protecting cyberspace is increasingly important. In practice, however, APT attacks are highly covert, which makes it difficult for security systems to detect and prevent them before or during an attack. Post-incident tracing and analysis is therefore a focus of security work. This paper studies the organizational origin and technical representation of malware samples, and aims to accomplish the following tasks: (1) Propose a malicious code analysis platform oriented towards multi-dimensional features. Based on the VMware ESXi framework, this paper builds a virtualization-based software sample analysis platform. By combining various tools and self-developed scripts on the platform, multi-dimensional feature extraction covering two dimensions, four categories, and five types of features of software samples can be realized. A number of samples, including benign and malicious software, were collected to form a sample set, and features were extracted from the sample set using the platform. (2) Study a software organization traceability method oriented towards multi-dimensional features. The collected multi-dimensional features are analyzed, each feature is matched with a model for the APT organization traceability task, and an ensemble learning framework is then used to further improve classification performance; the features that describe the software samples well are identified from the experimental results. In the single-feature software organization traceability method, each feature is matched with a corresponding model, and its effectiveness is evaluated through experimental analysis. In the multi-class-feature software organization traceability method, two model fusion methods, feature fusion and ensemble learning, are discussed, and their effectiveness is evaluated and analyzed through experiments. (3) Study a software technical representation method based on TTP (Tactics, Techniques, and Procedures) labels. TTP labels in the ATT&CK matrix are mapped to the attack techniques of a sample, so as to reflect the attacker's threat behaviors and actions. In this paper, the model of the software organization traceability method is modified. Based on the features that describe the software samples well, the corresponding algorithm models are used to predict the TTP labels of software samples, and an end-to-end TTP label prediction model built on multi-dimensional features is proposed. Finally, the effectiveness of the end-to-end model on the TTP label prediction task is evaluated and analyzed through several experiments. |
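
The two fusion strategies named in task (2) and the multi-label nature of TTP prediction in task (3) can be illustrated with a minimal sketch. It is not the paper's implementation: the group names, the feature-view names, the soft-voting rule, and the 0.5 threshold are all assumptions chosen for illustration, and the per-feature models are represented only by the probability vectors they would output.

```python
from typing import Dict, List

# Hypothetical APT group labels (assumption, not from the paper).
GROUPS = ["group_a", "group_b", "group_c"]


def fuse_features(feature_views: Dict[str, List[float]]) -> List[float]:
    """Feature fusion: concatenate per-view feature vectors into one vector
    that a single downstream model would consume."""
    fused: List[float] = []
    for name in sorted(feature_views):  # fixed order keeps column layout stable
        fused.extend(feature_views[name])
    return fused


def soft_vote(per_model_probs: Dict[str, List[float]]) -> List[float]:
    """Ensemble learning, shown here as soft voting (an assumed variant):
    average the class probabilities produced by one model per feature."""
    if not per_model_probs:
        raise ValueError("at least one model output is required")
    averaged = [0.0] * len(GROUPS)
    for probs in per_model_probs.values():
        if len(probs) != len(GROUPS):
            raise ValueError("each model must score every group")
        for i, p in enumerate(probs):
            averaged[i] += p / len(per_model_probs)
    return averaged


def predict_group(per_model_probs: Dict[str, List[float]]) -> str:
    """Return the group with the highest averaged probability."""
    averaged = soft_vote(per_model_probs)
    best = max(range(len(averaged)), key=averaged.__getitem__)
    return GROUPS[best]


def predict_ttps(label_scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Multi-label TTP prediction: a sample may carry several TTP labels,
    so every label whose score reaches the (assumed) threshold is kept."""
    return sorted(t for t, s in label_scores.items() if s >= threshold)


if __name__ == "__main__":
    # Illustrative probabilities from two hypothetical per-feature models.
    outputs = {"static": [0.6, 0.3, 0.1], "dynamic": [0.2, 0.7, 0.1]}
    print(predict_group(outputs))
    # Illustrative scores keyed by ATT&CK technique IDs.
    print(predict_ttps({"T1059": 0.9, "T1055": 0.2, "T1547": 0.5}))
```

The design difference is where the combination happens: feature fusion merges inputs before a single model is trained, while ensemble learning keeps one model per feature and merges their outputs.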