Font Size: a A A

Research On Frequent Pattern Mining Algorithms For Privacy-preserving

Posted on:2018-09-27Degree:MasterType:Thesis
Country:ChinaCandidate:T WangFull Text:PDF
GTID:2518306248982979Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Data mining has become the research hot spot because it can obtain potential,correct and valuable knowledge from massive data.Association rule mining is one of the core research branches of data mining,and as a key step of association rule mining,as a key step of association rule mining,frequent pattern mining is of greater value.With the frequent occurrence of privacy leaks,data privacy is facing greatly threatening.Due to the addition of privacy protection,efficiency has become the bottleneck of frequent pattern mining algorithms for privacy-preserving.Therefore,it is an urgent problem to improve the efficiency of frequent pattern mining algorithms for privacy-preserving,and a small amount of database updates also need to re-run the original algorithm,often resulting in a great waste of mining results before,an update model must be added to the mining algorithm to improve its efficiency.This paper launches the research work on view of the efficiency problem of privacy-preserving frequent pattern mining algorithm,through in-depth analysis of factors in restricting the efficient of privacy-preserving frequent pattern mining algorithm,and then proposes the improved algorithm,the main achievement are as follows:1)Aimed at the problem that bit-and operation and the count of a large number of subsets consume plenty of time and system memory for privacy-preserving data mining algorithm BEMASK in tens of thousands of candidate sets,the PBEK algorithm is proposed.PBEK is based on partition and the prior constraint.In the new algorithm,the database is divided into several homogeneous overlapping pieces,and the BEMASK algorithm is run on each small piece.Through the prior constraint,the number of local frequent item sets is reduced,and the global frequent itemsets are generated efficiently by scanning the global bitmaps.The results of the experiment show that the improved algorithm has higher efficiency than the BEMASK algorithm.2)In order to solve the problem that the GRRPH algorithm needs multiple database scans during the running time and counting several times,this paper proposes the improved algorithm BRRPH of GRRPH.BRRPH algorithm uses bitmap technology to represent transaction in the database,the results of the experiment show that the improved algorithm has higher efficiency than the RRPH algorithm.Aimed at the frequent or occasional additions to the database will make some of the existing strong association rules invalid or weak rules become strong rules,this paper introduces an incremental update model,and then proposed the IBRRPH algorithm.The results of the experiment show that the improved algorithm has higher efficiency than the RRPH algorithm.
Keywords/Search Tags:Privacy-preserving, Data mining, BEMASK, RRPH
PDF Full Text Request
Related items