
Research On Android Malware Detection Based On Image And Graphic Features Representation Learning

Posted on: 2024-01-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Y Feng
Full Text: PDF
GTID: 1528307337965799
Subject: Computer Science and Technology
Abstract/Summary:
The rapid spread of malware and its variants on mobile devices, driven by the widespread adoption of mobile networks, poses significant threats to individuals’ privacy and property. In recent years, researchers have applied representation learning-based methods to detect Android malware at scale, with some success. However, several issues remain open. Detection effectiveness varies significantly with the feature representation format, the level of feature coverage, and the data labeling rate. In finer-grained malware family classification, families within the same malware category partially resemble one another in behavior, which calls for more sophisticated detection models and longer training time. To address these problems, this dissertation makes the following main contributions.

Firstly, to address the low accuracy and precision of existing Android malware detection methods, network traffic features are converted into grayscale images with smaller dimensions and greater representational power, and Sel Att Conv LSTM, an Android malware detection model based on network traffic grayscale images, is proposed. Building on automatic feature selection, the model applies Convolution-LSTM to learn the temporal and spatial features of the traffic grayscale images, and adds an attention mechanism that gives more weight to the features that contribute more to detection. Multiple groups of experiments demonstrate the feasibility and effectiveness of detecting Android malware with a grayscale image representation of network traffic and the Sel Att Conv LSTM model.

Secondly, to address the excessive training time of existing Android malware category and family classification methods, a three-layer detection architecture is proposed to detect Android malware and classify its category and family. The first layer extracts static features and converts them into one-hot encoding, which is fed into a DNN model to identify suspicious applications and malicious applications. The second layer takes the suspicious applications from the first layer as input, captures their network traffic, converts it into grayscale images, and feeds them into the proposed CACNN model to decide whether each application is malicious or benign. The third layer takes the malware identified by the first and second layers as input, obtains the grayscale images of its network traffic, and again applies CACNN to classify malware categories and families. The convolutional auto-encoding in CACNN compresses the network traffic image data and amplifies its features, which reduces the model’s training time. Experimental results show that this three-layer detection framework reduces training time and improves detection efficiency.

Thirdly, because only a small share of Android software is labeled in real environments, which degrades the training of detection models, NFNI-MGATMg, a novel model for Android malware detection and malware category classification based on network traffic graph representation learning, is proposed. NFNI-MGATMg exploits the ability of GAT to aggregate information from neighboring nodes to achieve effective malware detection with a small amount of labeled data. The model first constructs a node graph and its corresponding edge-node graph. It then applies MGATMg to learn node and edge-node features for network traffic graph-based Android malware detection and category classification. Experiments on malware detection and category classification across datasets with varying labeling ratios verify the model’s performance at low label rates.

Finally, because a single feature covers software behavior insufficiently and thereby reduces the accuracy of deep learning models, a multi-feature fusion method for Android malware detection and category classification, Hybrid Detector, is proposed. By fusing multiple features, the method represents software behavior more accurately. It first extracts the function call graph of the Android software and constructs the network behavior function call graph through pruning optimization. It then uses the software’s network traffic to construct the node interaction graph and edge-node graph. These two types of features are converted into vector representations with Node2vec. Finally, four basic classifiers are used to compare the accuracy of malware detection and category classification using network traffic alone against that using fused features. The results verify that a detection model fusing multiple features improves the accuracy of malware detection, and that complementary features keep the model stable when a single feature is not sufficient to represent software behavior.
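
The grayscale image representation of network traffic used in the first two contributions can be illustrated with a minimal sketch. The abstract does not specify how bytes are mapped to pixels, so everything here is an assumption for illustration only: the function name `flow_to_grayscale`, the 28 x 28 image size, and the choice to truncate or zero-pad a flow's raw bytes so that each byte becomes one pixel intensity (0 to 255). It is not the dissertation's actual preprocessing.

```python
import numpy as np

def flow_to_grayscale(payload: bytes, side: int = 28) -> np.ndarray:
    """Map raw traffic bytes to a side x side grayscale image.

    Assumed scheme (hypothetical, not taken from the dissertation):
    keep the first side*side bytes, zero-pad shorter flows, and treat
    each byte as one pixel intensity in the range 0..255.
    """
    n_pixels = side * side
    # Interpret the (possibly truncated) payload as unsigned 8-bit values.
    data = np.frombuffer(payload[:n_pixels], dtype=np.uint8)
    # Start from an all-black image so short flows are zero-padded.
    image = np.zeros(n_pixels, dtype=np.uint8)
    image[: data.size] = data
    return image.reshape(side, side)
```

Under these assumptions, every flow yields an image of the same fixed size regardless of its length, which is what allows a convolutional model to consume the flows as a batch.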
Keywords/Search Tags: Android, malware detection, network traffic, multi-feature fusion, representation learning