Research On Privacy Protection Of Information Sharing For Utility Minin

Posted on: 2024-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y Li
Full Text: PDF
GTID: 2568307106477994
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of internet technology, data plays a crucial role in socio-economic prosperity. Utility mining can extract valuable information from massive and complex data, which brings new opportunities for enterprises and institutions. However, as privacy breaches and abuses occur frequently, people are becoming increasingly aware of the importance of privacy preservation. Data owners therefore typically apply special processing to the sensitive information hidden in their data before sharing it, so that their interests are not infringed upon.

Privacy-preserving utility mining extends privacy-preserving data mining with the concept of utility. Utility measures the value of an item or service and, to some extent, expresses the subjective preferences of users. Compared with traditional data mining, utility mining can better uncover patterns that meet users' needs. However, the data often contains sensitive information, and if it is exploited by untrusted third parties, the data owner may suffer losses. To address the privacy preservation problem in utility mining, researchers have proposed many privacy-preserving algorithms. Traditional privacy-preserving methods mostly rely on encryption and perturbation, which reduce the analytical value of the data and are rarely used in utility mining. In addition, most current methods for privacy preservation in utility mining have high time complexity and significant side effects, and they usually lose a substantial amount of information after sanitization. To solve these issues, the thesis conducts a comprehensive investigation of the privacy preservation problem in utility mining and makes the following contributions:

(1) The thesis proposes the concept of Hiding Cost to improve the evaluation of privacy-preserving utility mining. Building on the traditional evaluation indicators Hiding Failure, Missing Cost, and Artificial Cost, Hiding Cost assigns a weight to each of the three indicators. The weights can be adjusted according to the actual situation, which makes the evaluation of an algorithm more flexible and adaptable to different application scenarios.

(2) To overcome the shortcomings of traditional privacy-preserving utility mining algorithms, such as high time complexity and significant side effects, the thesis proposes the BCUTD algorithm. The algorithm is built on a new tree structure called the BCU-Tree. All sensitive nodes in the tree are encoded using bitwise operations and store the utility information of the corresponding sensitive items, which avoids repeated utility calculations. While constructing the tree, the algorithm also creates a dictionary table that records the node information of every sensitive item in the tree, which avoids redundant tree traversal during sanitization. Compared with the traditional FPUTT-Tree, the BCU-Tree is faster to construct and yields better sanitization results. To verify its performance, the BCUTD algorithm was compared against five other algorithms on four representative datasets. The experiments show that BCUTD improves performance by 2-5 times.

(3) Traditional privacy-preserving utility mining algorithms usually rely on exact algorithms, heuristic algorithms, and similar methods to accelerate the sanitization process. However, these algorithms cannot achieve a good balance between time complexity and side effects. The thesis therefore proposes a new data structure, the utility-list buffer. Unlike the traditional utility list, this structure is applied to privacy-preserving utility mining for the first time. The utility-list buffer consists of sensitive items and UTLists, where the UTLists store all the sensitive information in the database that corresponds to the sensitive items. Compared with traditional algorithms, this structure greatly reduces the computational cost of sanitization. To reduce side effects, the algorithm also introduces two new concepts, tns and SINS. The tns represents the relationship between transactions and non-sensitive high utility itemsets. The SINS records the number of times each sensitive item appears in non-sensitive itemsets. Extensive experiments show that the resulting algorithm, FULB, is 15-20 times faster than traditional algorithms and produces smaller side effects.
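The Hiding Cost idea in contribution (1) can be illustrated with a minimal sketch. The abstract says only that Hiding Cost weights Hiding Failure, Missing Cost, and Artificial Cost. The weighted-sum form, the equal default weights, the ratio definitions of the three indicators, and all function and parameter names below are assumptions made for illustration, not the thesis's own definitions.

```python
from typing import FrozenSet, Set, Tuple

Itemset = FrozenSet[str]


def side_effects(
    original_huis: Set[Itemset],
    sanitized_huis: Set[Itemset],
    sensitive: Set[Itemset],
) -> Tuple[float, float, float]:
    """Return (hiding_failure, missing_cost, artificial_cost) as ratios.

    Assumed definitions (commonly used in the literature; the thesis may differ):
      hiding failure  = sensitive itemsets still minable after sanitization
      missing cost    = non-sensitive high utility itemsets lost by sanitization
      artificial cost = itemsets that became high utility only after sanitization
    """
    non_sensitive = original_huis - sensitive
    hf = len(sensitive & sanitized_huis) / len(sensitive) if sensitive else 0.0
    mc = (
        len(non_sensitive - sanitized_huis) / len(non_sensitive)
        if non_sensitive
        else 0.0
    )
    ac = (
        len(sanitized_huis - original_huis) / len(sanitized_huis)
        if sanitized_huis
        else 0.0
    )
    return hf, mc, ac


def hiding_cost(
    hf: float,
    mc: float,
    ac: float,
    w_hf: float = 1 / 3,
    w_mc: float = 1 / 3,
    w_ac: float = 1 / 3,
) -> float:
    """Weighted combination of the three indicators (assumed to be a weighted sum)."""
    return w_hf * hf + w_mc * mc + w_ac * ac


if __name__ == "__main__":
    fs = frozenset
    original = {fs("A"), fs("AB"), fs("BC"), fs("C")}
    sensitive = {fs("AB")}
    sanitized = {fs("A"), fs("C"), fs("D")}

    hf, mc, ac = side_effects(original, sanitized, sensitive)
    # Weighting hiding failure most heavily, e.g. when leaks are the main concern.
    print(hiding_cost(hf, mc, ac, w_hf=0.5, w_mc=0.3, w_ac=0.2))
```

Raising `w_hf` relative to the other weights penalizes algorithms that leave sensitive itemsets minable, while raising `w_mc` or `w_ac` favors algorithms that leave the non-sensitive results intact. This is one way to read the abstract's point that the weights adapt the index to different application scenarios.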
Keywords/Search Tags:Privacy preservation, High utility mining, Bitmap code, Utility list, Dictionary