The Design and Implementation of a Malware Behavior Detection System Based on Decision Tree
Posted on: 2012-11-28 | Degree: Master | Type: Thesis | Country: China | Candidate: T Cheng | GTID: 2218330374953898 | Subject: Computer application technology

Abstract/Summary:

Anti-virus vendors receive millions of suspicious program samples, which may include both malware and benign programs. Anti-virus engineers must identify the genuinely malicious programs and extract virus signatures from them in order to update the virus signature database. The purpose of this thesis is to develop an automatic malware detection system that can efficiently detect malware among a large number of unknown samples and generate detailed reports of malware behavior. The detected malware can then be passed to anti-virus engineers for further analysis.

The basic scheme is to capture a program's run-time API call sequence using API hooking, and then to extract high-level program behaviors from the context of that sequence. In the training stage, behavior data from a large number of known samples is used to generate a classification model based on the C5.0 algorithm. The classification model is then used to categorize unknown samples as malware or benign software, simulating the decision logic of an anti-virus engineer.

The main work and contributions are as follows:

1. We collect 110 typical behavior attributes and 26 file attributes, which the detection system targets.

2. Existing behavior detection systems such as CWSandbox all rest on the assumption that the API call sequence reflects behavior, and they treat the API call sequence obtained through API hooking as the program's behavior. We share the view that the API call sequence reflects behavior, but we do not treat the API call sequence itself as the behavior. Instead, we extract high-level behaviors from the API call sequence according to the context of the calls and the relations between them. Behaviors of this kind are much more meaningful than raw API calls.

   To capture the API call sequence, we use Detours to hook system APIs. A drawback of Detours is that each API requires its own hook function, which makes the program more complex. After studying Detours, we propose an improved hook mechanism in which a single hook function handles all hooked APIs. No additional code is required to hook a new API; in other words, hooking is driven by data, not code.

   To extract high-level behaviors, we define every API call as a Prolog fact and every behavior as a Prolog rule. This takes advantage of Prolog's logical inference capability to extract high-level behaviors accurately.

3. The detection system runs in a VMware virtual machine that is controlled through the VIX API. It runs each sample program and captures its API call sequence inside the virtual environment. VIX makes the API extraction process automatic and efficient.

4. Typical benign program behaviors are included in the classification model. Compared with other behavior detection systems, our system, Malware Sandbox, achieves a low false positive (FP) rate because it takes typical benign behaviors into account.

Experiments show that Malware Sandbox achieves a relatively high detection rate and a low misdetection rate, and that it generates detailed, high-quality behavior reports for further analysis. However, the system still has a limitation: malware with anti-virtual-machine capabilities can evade detection because the system runs in a virtual environment. Since suspicious programs must be run in a virtual environment to contain the threat they pose, this limitation has not yet been addressed.

Keywords/Search Tags: Malware, High-level behaviors, API hook, Classification model, Decision tree, Virtual machine
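
The idea of deriving high-level behaviors from the context of an API call sequence, rather than from individual calls, can be illustrated with a small sketch. The thesis expresses API calls as Prolog facts and behaviors as Prolog rules; the Python below is only a stand-in for that logic, not the author's implementation. The API names are real Windows APIs, but the `ApiCall` record, the two rules, the behavior labels, and the sample trace are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ApiCall:
    """One hooked API call; plays the role of a Prolog fact (hypothetical layout)."""
    name: str
    args: dict
    result: object = None


def extract_behaviors(trace):
    """Derive high-level behaviors by linking calls through shared handles."""
    behaviors = []
    open_files = {}  # file handle -> path, filled in by CreateFileW
    open_keys = {}   # registry handle -> subkey, filled in by RegOpenKeyExW

    for call in trace:
        if call.name == "CreateFileW":
            open_files[call.result] = call.args["path"]
        elif call.name == "WriteFile":
            path = open_files.get(call.args["handle"])
            # Hypothetical rule: a write to a handle opened on an .exe path
            # counts as dropping an executable.
            if path is not None and path.lower().endswith(".exe"):
                behaviors.append(("drop_executable", path))
        elif call.name == "RegOpenKeyExW":
            open_keys[call.result] = call.args["subkey"]
        elif call.name == "RegSetValueExW":
            subkey = open_keys.get(call.args["handle"], "")
            # Hypothetical rule: setting a value under a Run key counts as
            # registering an autostart entry.
            if subkey.lower().endswith(r"\currentversion\run"):
                behaviors.append(("autostart_persistence", call.args["data"]))
        elif call.name == "CloseHandle":
            open_files.pop(call.args["handle"], None)
    return behaviors


# Made-up trace: the sample writes an executable and registers it to autostart.
trace = [
    ApiCall("CreateFileW", {"path": r"C:\Windows\Temp\sample.exe"}, result=0x1A4),
    ApiCall("WriteFile", {"handle": 0x1A4, "size": 4096}),
    ApiCall("CloseHandle", {"handle": 0x1A4}),
    ApiCall("RegOpenKeyExW",
            {"subkey": r"Software\Microsoft\Windows\CurrentVersion\Run"},
            result=0x2B0),
    ApiCall("RegSetValueExW",
            {"handle": 0x2B0, "data": r"C:\Windows\Temp\sample.exe"}),
]

for label, detail in extract_behaviors(trace):
    print(label, detail)
```

Neither `WriteFile` nor `RegSetValueExW` is meaningful on its own here; each becomes a behavior only when its handle is traced back to the call that opened it. That context is what a flat API call sequence lacks, and behaviors of this kind could then serve as attributes for the decision tree classifier.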