
Research On The Random Forest Based Detection Of Malicious Mobile Applications At Runtime

Posted on: 2016-11-03
Degree: Master
Type: Thesis
Country: China
Candidate: J D Qiu
Full Text: PDF
GTID: 2308330464469343
Subject: Software engineering
Abstract/Summary:
With the popularity of Android devices, Android malware has grown rapidly, and the number of malicious programs that use APK Protection is growing exponentially. These protective operations encrypt the DEX file of the malicious program, which makes it difficult for ordinary static analysis methods to handle them. Meanwhile, some malicious programs exit immediately when they detect that the application is running in an emulator.

To solve these problems, this paper focuses on dynamic analysis. It defines the function call sequence as the detection object and uses the Xposed framework to hook system calls and build a dynamic testing environment. After creating an event sequence from the Manifest file of the APK, we first trigger the events in the testing environment as automated input and collect the function call sequences produced after each event as output. We then use the idea of Multi-HMM to build a model for each type of output event. Finally, we use the Random Forest algorithm to ensemble the multiple HMM classifiers with static features and evaluate the result. The experiments show that the RF-Based-Multi-HMM (RBMH) algorithm achieves a high hit rate (TPR) and accuracy (AC), a low false positive rate (FPR), root mean square error (RMSE), and generalization error, and that it remains stable with a smaller number of samples.

The main work of this paper is as follows:

(1) Collecting the function call sequences. By analyzing malicious behavior on the Android platform, we define the event and the function call sequence, and use function call sequences to describe the characteristics of the corresponding type of event. We use the Xposed framework to hook system calls and build a dynamic testing environment. After extracting event sequences from the Manifest file of the APK, we treat the events triggered in the testing environment as automated input and the function call sequences collected after each event as output. This paper implements the method for collecting function call sequences and verifies the validity of the data by experiment.

(2) Detection algorithm design. We use the idea of Multi-HMM to build a model for each type of output event, and then use the Random Forest algorithm to ensemble the multiple HMM classifiers with static features and evaluate the result.

(3) Experimental comparison. We design experiments to compare the single HMM classifier with the RBMH classifier. The evaluation indicators are accuracy, false alarm rate, root mean square error, and other indicators. We also design experiments that study the stability of the RBMH algorithm with different numbers of samples.
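The detection step in (2) can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not give the HMM topology, the symbol encoding, or the static features, so everything below is an assumption. Each hooked function is assumed to be encoded as an integer symbol, each event type is assumed to have one already-trained discrete HMM given as `(start, trans, emit)`, and the helper names `forward_log_likelihood` and `build_feature_vector` are hypothetical. The sketch scores a function call sequence with the scaled forward algorithm and concatenates the per-event scores with static features into one vector that a random forest could consume.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | HMM) for a discrete HMM using the scaled forward algorithm.

    obs   : list of integer symbols (assumed encoding of hooked function calls)
    start : start[i]    = P(first state is i)
    trans : trans[i][j] = P(next state is j | current state is i)
    emit  : emit[i][k]  = P(symbol k | state i)
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik


def build_feature_vector(sequences_by_event, models_by_event, static_features):
    """Concatenate one HMM score per event type with the static features.

    The per-symbol average log-likelihood is used so that sequences of
    different lengths are comparable (a design choice made for this sketch).
    Event types with no collected sequence get 0.0 as a placeholder.
    """
    features = []
    for event in sorted(models_by_event):
        obs = sequences_by_event.get(event)
        if obs:
            start, trans, emit = models_by_event[event]
            score = forward_log_likelihood(obs, start, trans, emit)
            features.append(score / len(obs))
        else:
            features.append(0.0)
    return features + list(static_features)


# Toy two-state, two-symbol model; all numbers are made up for illustration.
toy_hmm = ([0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.9, 0.1], [0.2, 0.8]])
models = {"BOOT_COMPLETED": toy_hmm, "SMS_RECEIVED": toy_hmm}  # hypothetical event types
sequences = {"SMS_RECEIVED": [0, 1, 1, 0]}                     # hypothetical call sequence
vector = build_feature_vector(sequences, models, static_features=[1, 0, 3])
print(vector)
```

In a full pipeline, vectors like this one, built for labelled benign and malicious samples, would be passed to a random forest implementation (for example scikit-learn's `RandomForestClassifier`) for training and prediction; that step is omitted here to keep the sketch dependency-free.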
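The evaluation indicators named in (3) can be computed as in the following sketch. The abstract does not define them, so the standard definitions are assumed: hit rate TPR = TP / (TP + FN), false positive rate FPR = FP / (FP + TN), accuracy = (TP + TN) / N, and root mean square error taken between the predicted malware probability and the 0/1 label. The function name `evaluate` and the 0.5 decision threshold are hypothetical choices for this illustration.

```python
import math


def evaluate(y_true, y_prob, threshold=0.5):
    """Compute TPR, FPR, accuracy, and RMSE for a binary malware detector.

    y_true : list of 0/1 labels (1 = malicious)
    y_prob : list of predicted probabilities of the malicious class
    """
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    n = len(y_true)
    squared_error = sum((prob - label) ** 2 for label, prob in zip(y_true, y_prob))
    return {
        "TPR": tp / (tp + fn) if tp + fn else 0.0,
        "FPR": fp / (fp + tn) if fp + tn else 0.0,
        "accuracy": (tp + tn) / n,
        "RMSE": math.sqrt(squared_error / n),
    }


# Made-up labels and scores, only to show the call.
metrics = evaluate([1, 1, 0, 0], [1.0, 0.0, 1.0, 0.0])
print(metrics)
```

Running the same function on the outputs of a single HMM classifier and of the combined classifier would give the side-by-side comparison the experiments describe.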
Keywords/Search Tags:mobile security, dynamic detection, system call sequence, Hidden Markov model, random forest, ensemble learning