Research And Implementation Of Android Malware Detection Based On API Sequences And Permissions

Posted on: 2024-08-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhang
Full Text: PDF
GTID: 2568306941984009
Subject: Cyberspace security
Abstract/Summary:
In recent years, Android malware has evolved rapidly and continuously, causing serious problems for users and security researchers. Although deep learning models have been applied to Android malware detection, their performance degrades as malware threats grow and Android system versions change. In addition, in malware family identification, existing models struggle under real-world data distributions to distinguish minority or emerging families, which hinders security research. To address the model aging problem and the long-tail distribution of family data, this thesis proposes the following three solutions:

(1) To address the impact of unknown APIs on model aging resistance and generalization, an API vector representation method based on a subword corpus and a sensitive API determination method based on a permission knowledge base are proposed, so that unknown APIs can be readily represented and characterized. To make better use of this information, a multi-layer residual stack of multi-head attention captures important API calls, and a Bi-LSTM models and classifies API sequences. On the AndroZoo dataset, accuracy reaches 98.28%; in the generalization and anti-aging tests on different datasets, accuracy improves by 0.41% and 3.91%, respectively, over the baseline method.

(2) A HAN- and GAN-based classification method for Android malware families is proposed. Applications, APIs, and permissions define a heterogeneous graph, the relationships among the three define meta-paths, and application nodes are embedded with HAN. Minority-class data is augmented with a GAN that adds conditional regularization terms. Compared with the baseline model, classification efficiency improves, and accuracy in the multi-family classification experiments increases by 0.78% on average. The F1 score of each family is also computed in a 30-family classification experiment. The results show that, compared with the baseline method, the proposed method improves overall family classification performance while also improving performance on minority families, those that account for less than 3% of the total data.

(3) Building on the malware detection model based on LSTM and the attention mechanism, this thesis designs and implements an Android malware analysis and detection system with local analysis and a self-updating model, tailored to the characteristics of the method. To address the long detection time and poor aging resistance of existing solutions, the system implements local feature extraction, which removes the performance bottleneck of traditional online feature extraction. A model update function is also provided to further guard against performance degradation caused by model aging.
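The subword idea in solution (1) can be illustrated with a minimal sketch. The abstract does not specify how the subword corpus is built or how vectors are learned, so everything here is an assumption for illustration: the helper names `split_subwords`, `build_vocab`, and `encode` are hypothetical, API names are assumed to be in dotted Java form, and a simple subword count vector stands in for whatever learned embedding the thesis uses. The point shown is only that an API unseen at training time still receives a non-empty representation through the subwords it shares with known APIs.

```python
import re
from collections import Counter


def split_subwords(api_name):
    """Split an API name into lowercase subwords on separators and camelCase boundaries."""
    tokens = []
    for part in re.split(r"[^A-Za-z0-9]+", api_name):
        # Acronym runs, capitalized or lowercase words, and digit runs.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", part))
    return [t.lower() for t in tokens]


def build_vocab(api_names):
    """Assign an index to every subword seen in the known (training-time) APIs."""
    vocab = {}
    for name in api_names:
        for tok in split_subwords(name):
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(api_name, vocab):
    """Count vector over known subwords; subwords outside the vocabulary are ignored."""
    counts = Counter(split_subwords(api_name))
    vec = [0] * len(vocab)
    for tok, n in counts.items():
        if tok in vocab:
            vec[vocab[tok]] = n
    return vec


# Assumed example data: two known APIs and one API missing from the training set.
known_apis = [
    "android.telephony.TelephonyManager.getDeviceId",
    "android.location.LocationManager.getLastKnownLocation",
]
vocab = build_vocab(known_apis)
unknown_vec = encode("android.telephony.TelephonyManager.getImei", vocab)
```

In this sketch the unknown `getImei` call shares the `android`, `telephony`, `manager`, and `get` subwords with a known API, so its vector is non-zero even though the full name was never seen. A sequence of such vectors would then be the input to the attention and Bi-LSTM layers described above.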
Keywords/Search Tags: Android malware, vector representation, deep learning, data enhancement