Research Into The Detection And Classification Of Malware

Posted on: 2010-07-10
Degree: Master
Type: Thesis
Country: China
Candidate: H L Zhao
Full Text: PDF
GTID: 2178330338475947
Subject: Computer application technology
Abstract/Summary:
With the explosive growth of malware, which often uses polymorphic and metamorphic techniques, traditional signature-based detection methods can no longer meet security requirements. From the perspective of practical anti-virus requirements, this thesis proposes an automated method for detecting and classifying malicious code. The Automated Malware Integrated Analysis System (AMIAS) extracts static and dynamic behavior features and then uses a support vector machine (SVM) to detect malware. AMIAS also generates a behavior analysis report for each malware sample. We learn the behavior patterns of each malware family in a database of known malware and build a multi-class SVM classifier to classify newly detected malicious samples. Our method overcomes the shortcomings of using static or dynamic detection alone and can rapidly detect massive numbers of malware samples. The classification results can provide guidance for malware removal.

The main contents of this thesis cover four aspects. First, we propose a definition of static and dynamic behavior features. By learning from the static and dynamic behavior information of known malware, we define a 55-dimensional combination feature. The static part comprises 20 features extracted from differences in PE file structure between benign and malicious code. Because dynamic behavior analysis can detect unknown malicious code, behavior features form the main body of the combination feature. Based on extensive research into how malware uses the Win32 API, we define 35 behavior features. Each feature represents one kind of dynamic behavior event, and each event is derived from summarized information about the corresponding Win32 API function calls and their parameters.

Second, we implement the Automated Malware Integrated Analysis System (AMIAS) using virtual machine control technology.
AMIAS has two functions: it extracts the value of each feature defined in the feature space, and it generates a behavior analysis report for each sample. AMIAS is an automated online processing system designed to meet the demands of large-scale malware analysis.

Third, we propose a new malware detection method based on SVM. Using the combination feature, we construct an SVM model for malware detection. The detection dataset contains 9917 malware samples and 6591 benign samples. According to the different sources of the data, we design an initial experiment and create different training sets for training the SVM classifier. Based on statistics of the number of effective features in the misclassified samples of the initial experiment, we refine the experiment; the results show that when the threshold number is 6, both the detection rate and the sample utilization are high. We also design comparative experiments to verify the effect of using the combination feature together with the SVM model, and the results show that the joint detection method performs better. For gray samples, we improve the model by introducing a method that quantifies feature importance: each new feature value is the product of the feature's weight and its original value. Experiments show that the improved model detects gray samples better.

Fourth, we improve the classification of malware behavior reports and accomplish malware classification indirectly through report classification. Based on malware behavior units, we extract feature words from the behavior reports, define a mapping function that maps each behavior analysis report into a vector space, and finally train a multi-class SVM classifier for automatic malware classification. Compared with similar methods, experimental results show that our method effectively improves the accuracy and efficiency of malware classification.
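
The feature handling described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The feature counts (20 static, 35 behavior) and the threshold of 6 come from the abstract. Everything else is an assumption made for illustration: the function names are hypothetical, a feature is treated as "effective" when its value is non-zero, the threshold is read as a minimum count of effective features, and the weights are placeholders. The resulting vectors would then be passed to an SVM, which is not shown here.

```python
from typing import List, Sequence

N_STATIC = 20      # static PE-structure features (count from the abstract)
N_BEHAVIOR = 35    # Win32 API behavior features (count from the abstract)
MIN_EFFECTIVE = 6  # threshold number reported in the abstract


def combine_features(static: Sequence[float],
                     behavior: Sequence[float]) -> List[float]:
    """Concatenate static and behavior features into one 55-dimensional vector."""
    if len(static) != N_STATIC or len(behavior) != N_BEHAVIOR:
        raise ValueError("expected 20 static and 35 behavior features")
    return list(static) + list(behavior)


def effective_feature_count(vector: Sequence[float]) -> int:
    """Count effective features.

    Assumption: a feature is 'effective' when its value is non-zero.
    The abstract does not define the term.
    """
    return sum(1 for value in vector if value != 0)


def is_usable(vector: Sequence[float], threshold: int = MIN_EFFECTIVE) -> bool:
    """Assumption: keep a sample only if it has at least `threshold` effective features."""
    return effective_feature_count(vector) >= threshold


def apply_feature_weights(vector: Sequence[float],
                          weights: Sequence[float]) -> List[float]:
    """New feature value = feature weight * original value, as the abstract states."""
    if len(vector) != len(weights):
        raise ValueError("vector and weights must have the same length")
    return [w * v for w, v in zip(weights, vector)]
```

A sample would first be built with `combine_features`, filtered with `is_usable`, and, for gray samples, rescaled with `apply_feature_weights` before classification. How the weights are computed is not stated in the abstract, so the sketch takes them as an input.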
Keywords/Search Tags: Malicious code, Combination feature, PE file parsing, Dynamic behavior analysis, Windows debugger, Windows API, Support vector machine, Weightiness of feature attribute, Malware classification