| As the global economy shifts and the political situation remains unstable, networks worldwide face exponentially growing security threats, which have caused heavy losses and even social unrest. Network security protection therefore still has a long way to go; malicious software in particular has caused great losses to vendors and users. At present, many commercial detection systems rely on static detection methods to detect malware. However, because of the limitations of static detection, these systems are easily bypassed by malware authors, which motivated dynamic behavior detection. Dynamic analysis detects malware by analyzing its behavior data and, compared with static analysis, is more effective and robust against modified viruses and even new ones. Many mature machine-learning-based malware behavior detection techniques exist, but most of them rely on a single detector. Such single-detector classifiers are usually trained on public or self-collected malware samples, so they perform poorly on the newest, previously unseen malware and on adversarial samples crafted from existing malware. In addition, existing methods are easily disturbed by maliciously tampered input: an attacker can cause the target malware detector to misclassify a sample by inserting a sequence of application programming interface (API) calls into the original malicious sample. Existing methods for generating adversarial malware, however, also have shortcomings: most of them cannot fully extract semantic features, and they rarely discuss whether the adversarial samples remain executable. To address these problems, and to improve the detector's detection rate on ordinary samples, its robustness and resistance to interference on adversarial samples, and the attack effectiveness and
executability of adversarial samples, this paper makes the following contributions: (1) It proposes a new and efficient adversarial sample generation method, which comprises a semantic-aware method for generating adversarial API call sequences and a method for generating executable malicious adversarial samples. The paper describes the generative model, the substitute model, and the adversarial training process. The semantic-aware generation method uses a generator based on BERT (Bidirectional Encoder Representations from Transformers). Compared with a traditional GAN (generative adversarial network), it captures more semantic information, models the relationships between API calls better, and achieves a stronger attack effect. The executable sample generation method turns the generated adversarial API call sequence into an executable file, putting the theory into practice. (2) It proposes a method for detecting malicious software behavior based on a hybrid-model intelligent classifier and describes the method's three stages (initialization, training, and application) as well as the library of algorithm models that may serve as base classifiers and meta-classifiers. Because the classifier combines several different types of base classifiers and is trained hierarchically, it largely avoids the overfitting to which single classifiers are prone. (3) Both methods are tested and analyzed. The adversarial sample generation method is evaluated in three respects: attack effectiveness, irrelevant API overhead, and the executability of malicious adversarial samples. The experimental results show that the adversarial sample generation technique reaches a maximum attack effectiveness of 98% against the base classifiers, incurs relatively low irrelevant API call overhead, and
ensures that the adversarial samples are executable. The hybrid-model intelligent classifier detects ordinary malicious samples with an accuracy of up to 97.54%, reduces the final attack effectiveness of the adversarial sample generation model to 48%, and slows that model's convergence by a factor of 6. |
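
The API-insertion attack and two of the evaluation criteria named above (attack effectiveness and irrelevant API overhead) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the helper names, the toy threshold detector, and the exact metric definitions (overhead as inserted calls divided by original calls; effectiveness as the share of originally detected samples that evade detection after perturbation) are assumptions made for illustration. The sketch only shows the constraint that inserted calls must leave the original call order intact.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical toy setup: the detector and its "suspicious" set are illustrative only.
SUSPICIOUS = frozenset({"VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"})


def insert_calls(original: Sequence[str],
                 insertions: Sequence[Tuple[int, str]]) -> List[str]:
    """Insert (position, api_name) pairs without removing or reordering original calls.

    A position is an index into the ORIGINAL sequence; the new call is placed
    before the original call at that index (position == len(original) appends).
    """
    by_pos: Dict[int, List[str]] = {}
    for pos, api in insertions:
        if not 0 <= pos <= len(original):
            raise ValueError(f"insertion position {pos} out of range")
        by_pos.setdefault(pos, []).append(api)
    perturbed: List[str] = []
    for i, call in enumerate(original):
        perturbed.extend(by_pos.get(i, []))
        perturbed.append(call)
    perturbed.extend(by_pos.get(len(original), []))
    return perturbed


def is_subsequence(original: Sequence[str], perturbed: Sequence[str]) -> bool:
    """True if every original call still appears in the perturbed trace, in order."""
    remaining = iter(perturbed)
    return all(call in remaining for call in original)


def api_overhead(original: Sequence[str], perturbed: Sequence[str]) -> float:
    """Assumed definition: number of inserted calls relative to the original length."""
    if not original:
        raise ValueError("original sequence must not be empty")
    return (len(perturbed) - len(original)) / len(original)


def attack_effectiveness(detector: Callable[[Sequence[str]], bool],
                         originals: Sequence[Sequence[str]],
                         adversarials: Sequence[Sequence[str]]) -> float:
    """Assumed definition: share of originally detected samples that evade detection."""
    detected = [(o, a) for o, a in zip(originals, adversarials) if detector(o)]
    if not detected:
        return 0.0
    evaded = sum(1 for _, a in detected if not detector(a))
    return evaded / len(detected)


def toy_detector(seq: Sequence[str]) -> bool:
    """Flag a trace as malicious when at least half of its calls are suspicious."""
    if not seq:
        return False
    share = sum(call in SUSPICIOUS for call in seq) / len(seq)
    return share >= 0.5
```

For example, padding the three-call trace `["VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"]` with four benign calls such as `GetTickCount` keeps the original calls as a subsequence while pushing the suspicious share below the toy detector's threshold. A real detector and a learned generator, such as the BERT-based one described in the abstract, would replace both the threshold rule and the hand-picked insertions.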