Research On Key Technology Of Android Program Behavior Analysis And Control Based On Information Flow

Posted on: 2024-09-16

Degree: Master

Type: Thesis

Country: China

Candidate: B S Yang

GTID: 2568307100473084

Subject: Software engineering

Abstract/Summary:

The openness of the Android operating system has attracted numerous developers to Android application development, covering fields from productivity tools to entertainment and games, and has made Android the mobile operating system with the largest market share. Mobile devices now influence every aspect of our lives and have become the center for storing and processing personal information. Because of the openness of the Android system and the imperfection of its own security mechanism, applications that steal private user information appear frequently, threatening the private information stored on mobile devices. Information flow technology, which aims to ensure the security of information, has been widely applied to the analysis and control of Android program behavior.

However, as applications grow more complex and diverse, it has become increasingly difficult to distinguish benign from malicious application behavior. The traditional approach of detecting malicious software using single information flow features may produce false positives. For example, both benign and malicious applications may use the same network interface to send out private user information. A benign application, however, may collect only the International Mobile Equipment Identity (IMEI) along with the International Mobile Subscriber Identity (IMSI), while a malicious application collects multiple pieces of information such as IMEI, IMSI, and geographic location. If the relationships between sensitive information flows are not considered, the feature representations of benign and malicious applications will be identical, making the two difficult to distinguish. Yet there is currently limited research on the analysis of information flow relationships.

Application markets require developers to upload privacy policy documents that explain their collection of sensitive information, and review those documents. However, regulation is not strict, and users often ignore these policies when downloading applications. Even if they do read them, it is difficult to judge whether the actual behavior of the application is consistent with the stated behavior. Current research on the consistency between the sensitive behavior of applications and their privacy policies is insufficient, which leads to widespread violations in personal information collection, inadequate notice of personal information collection, and over-collection of personal information.

Furthermore, Android applications integrate a large number of third-party components, and some components obtain system resources beyond their scope and abuse sensitive information. The information flow within these components is very complex, and their direct access to system resources makes efficient fine-grained control difficult. In practice, developers and users often have different functional and security requirements for different components. The use and processing of sensitive information must stay within a secure range without affecting basic functions, so a flexible mechanism is required to control component access to system resources. Traditional access control techniques, however, struggle to achieve fine-grained dynamic information flow control over the sensitive behavior of components.

In response to these issues, the main work of this thesis is as follows:

(1) To address the limitations of traditional single information flow features in characterizing the different behavior patterns of benign and malicious applications, as well as the insufficient research on fine-grained information flow relationship features, a malicious software detection method based on information flow relationship features is proposed. In the sensitive information flow analysis stage, the method explores the relationship features between information flows and gives a detailed formal description of the relationships between sensitive information flows. A dynamic programming method is then designed to analyze these relationships. The analysis identifies five types of relationships between sensitive information flows: convergence, divergence, inclusion, connection, and crossover. For example, the benign application mentioned earlier has a convergence relationship between the IMEI and IMSI flows to the network interface, while the malicious application has a convergence relationship between the IMEI, IMSI, and location information flows to the network interface. In this way, the different behavior patterns and information flow relationship features of benign and malicious applications can be characterized. In the feature construction stage, the relationship features are expressed as five-tuples, and the APIs in the continuous common subsequence are classified and expressed as six-tuples. These two parts of the features are then fused. In the detection stage, a machine learning model for malicious software detection is designed using convolutional neural networks. Experimental results show that the method achieves accuracy rates of 98.5% and 97.6% on the MalGenome and AndroZoo datasets, respectively. This demonstrates that a more fine-grained characterization of the relationship features between sensitive information flows plays an important role in distinguishing benign from malicious applications.

(2) Applications in application markets are widely found to violate personal information collection regulations, give inadequate notice of personal information collection, and collect personal information beyond the scope of consent, yet targeted research on this issue is lacking. To address this, a method for analyzing the consistency between the sensitive behavior of Android applications and their privacy policies is proposed. In the privacy policy analysis stage, key information is extracted from the privacy policy statement and the third-party information sharing list it contains, using a Bi-GRU-CRF neural network, and is transformed into privacy policy triples of entity, action, and data type. In the sensitive behavior analysis stage, to match the granularity of the privacy policy analysis results with that of the sensitive behavior analysis results, the IFDS framework is optimized by classifying sensitive API calls, deleting previously analyzed sensitive API calls from the input sensitive source list, and marking previously extracted sensitive paths, which reduces redundant analysis results and improves analysis efficiency. The extracted sensitive information flows are transformed into sensitive behavior two-tuples. In the consistency analysis stage, the semantic relationships between ontology terms are defined as equivalence, subordination, and approximation, and a semantic similarity is defined for each of these three relationships. The relationship between sensitive behavior and privacy policy is classified as consistent, with either a clear or a fuzzy description, or as inconsistent, with an omitted, incorrect, or ambiguous description. Finally, the proposed semantic similarity-based consistency analysis algorithm is used to analyze the consistency between sensitive behavior and privacy policy. Experimental results show that, among the 928 applications analyzed, 51.4% have inconsistencies between their sensitive behavior and their privacy policy statements.

(3) To address the difficulty traditional access control techniques have in achieving fine-grained dynamic information flow control over the sensitive behavior of components, a dynamic control method for component-level sensitive behavior based on a decentralized information flow strategy is proposed. The method first extracts the sensitive information flows of the components through static analysis, identifies the components involved as untrusted third-party components, and analyzes the system resources involved in the sensitive information flows. Based on SELinux mandatory access control rules, security domains are added to both the components and the system resources, and security labels are added to the component domain according to the defined decentralized information flow control model. The component domain is assigned the ability to access system resources, and the security labels are adjusted dynamically while the application runs, achieving dynamic system-level control over how components access and process sensitive information. Experimental results demonstrate that the method can effectively control the sensitive behavior of untrusted third-party components in an application with an accuracy rate of 98.7% and low performance overhead, ensuring that the use and processing of sensitive information by untrusted third-party components always stays within a secure range.
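As an illustration of the dynamic programming step in contribution (1), the following is a minimal sketch of finding the continuous common subsequence of API calls shared by two sensitive information flows. The abstract does not give the algorithm, so this uses the standard longest-common-substring recurrence as one plausible reading; the function name `longest_common_run` and the two example API sequences are hypothetical.

```python
from typing import List, Sequence


def longest_common_run(a: Sequence[str], b: Sequence[str]) -> List[str]:
    """Return the longest contiguous run of API calls shared by two flows."""
    best_len = 0
    best_end = 0  # end index (exclusive) of the best run inside `a`
    # prev[j] holds the length of the common run ending at a[i-2] and b[j-1]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended one step earlier in both flows
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return list(a[best_end - best_len:best_end])


# Hypothetical API sequences for an IMEI flow and an IMSI flow
imei_flow = ["getDeviceId", "StringBuilder.append", "toString", "OutputStream.write"]
imsi_flow = ["getSubscriberId", "StringBuilder.append", "toString", "OutputStream.write"]

shared = longest_common_run(imei_flow, imsi_flow)
```

In this toy input the two flows start at different sources and share their tail up to the same sink, which matches the convergence relationship described above. How the thesis maps such shared runs to its five relationship types is not stated in the abstract.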
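The consistency analysis in contribution (2) can be sketched in a much simplified form. The code below is an assumption-laden toy, not the thesis algorithm: it assumes a behavior two-tuple has the layout (entity, data type), uses a hypothetical parent-term ontology, and covers only three outcomes (clear, fuzzy, omission). The approximation relationship, the numeric semantic similarity, and the incorrect and ambiguous categories are left out because the abstract does not define them.

```python
from typing import Dict, List, Optional, Tuple

PolicyTriple = Tuple[str, str, str]  # (entity, action, data type)
Behavior = Tuple[str, str]           # (entity, data type): assumed layout

# Hypothetical ontology: specific term -> broader parent term
PARENT: Dict[str, str] = {
    "imei": "device identifier",
    "imsi": "device identifier",
    "gps location": "location",
}


def is_subordinate(term: str, broader: str) -> bool:
    """True if `broader` is an ancestor of `term` in the toy ontology."""
    current: Optional[str] = PARENT.get(term)
    while current is not None:
        if current == broader:
            return True
        current = PARENT.get(current)
    return False


def classify(behavior: Behavior, policy: List[PolicyTriple]) -> str:
    """Label one observed behavior against the declared policy triples."""
    entity, data = behavior
    result = "omission"  # default: the policy never mentions this behavior
    for p_entity, _action, p_data in policy:
        if p_entity != entity:
            continue
        if p_data == data:
            return "clear"   # equivalence: same entity, same data type
        if is_subordinate(data, p_data):
            result = "fuzzy"  # subordination: policy names only a broader term
    return result


policy = [("app", "collect", "imei"), ("app", "collect", "location")]
labels = [
    classify(("app", "imei"), policy),
    classify(("app", "gps location"), policy),
    classify(("ad_sdk", "imei"), policy),
]
```

Here the third behavior is attributed to a third-party SDK that the policy does not name, so it is reported as an omission even though the application itself declares IMEI collection.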
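The label-based control in contribution (3) can be pictured with a generic tag-based decentralized information flow control check. This is a sketch under stated assumptions: the abstract does not specify the label model, so the `Component` class, its `secrecy` and `may_declassify` fields, and the tag names are all hypothetical, and nothing here is SELinux policy syntax.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Set


@dataclass
class Component:
    name: str
    # Tags of the sensitive data the component has read so far
    secrecy: Set[str] = field(default_factory=set)
    # Tags the component is trusted to drop (declassify)
    may_declassify: FrozenSet[str] = frozenset()

    def read(self, resource_tags: Set[str]) -> None:
        """Reading tagged data raises the component's label at runtime."""
        self.secrecy |= resource_tags

    def can_send(self, sink_tags: Set[str]) -> bool:
        """A flow is allowed if every remaining tag is accepted by the sink."""
        return self.secrecy - self.may_declassify <= sink_tags


network: Set[str] = set()  # an untagged sink accepts no secret data

ad_sdk = Component("ad_sdk")
before_read = ad_sdk.can_send(network)
ad_sdk.read({"location"})
after_read = ad_sdk.can_send(network)

maps_sdk = Component("maps_sdk", may_declassify=frozenset({"location"}))
maps_sdk.read({"location"})
trusted_send = maps_sdk.can_send(network)
```

The untrusted component can reach the network until it reads location data, after which its label blocks the flow, while a component that has been granted the declassification capability for that tag is still allowed to send.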

Keywords/Search Tags:

Android, Program Behavior Analysis, Information Flow, Privacy Policy, Decentralized Information Flow Control
