Research On Optimization Of FP-growth Algorithm Based On Hadoop And Weighted Model

Posted on: 2020-10-28  Degree: Master  Type: Thesis
Country: China  Candidate: Y Y Li
GTID: 2438330626464263  Subject: Computer technology
Abstract/Summary:
Data mining analyzes data sources to find potentially useful information, and is therefore also called knowledge discovery. Association rule mining, an important topic in data mining, aims to discover the associations between items that lie behind the data. It is now widely applied in fields such as finance, the Internet, and medical treatment, and research interest in association rule mining algorithms continues to grow. Traditional association rule mining algorithms assume that all items are equally important and evenly distributed. In practice, however, items often differ in importance and are unevenly distributed. This thesis therefore studies weighted association rule mining.

The classical FP-growth algorithm is improved by introducing a weighted model. First, an ordered FP-tree replaces the traditional FP-tree in order to reduce storage space. Second, a two-dimensional list records weighted support, which eliminates the first traversal of the conditional pattern base when generating a weighted conditional FP subtree.

As the amount of data to be processed in association rule mining keeps increasing, the Hadoop distributed system architecture has emerged to meet this need, making the processing of massive data practical. This thesis uses the MapReduce parallel computing framework in Hadoop to process the data sets and proposes a balanced grouping strategy to avoid data skew. Distributing the processing reduces running time and enables efficient mining of association rules from massive data. The thesis explains the implementation steps of the improved FP-growth algorithm based on Hadoop and the weighted model in detail. Derivation and experimental verification show that the weighted ordered FP algorithm based on MapReduce outperforms the traditional weighted FP-growth algorithm: it adapts better to big data, runs in less time, and is more efficient.
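The abstract does not define weighted support or the balanced grouping strategy, so the following is only a minimal sketch under stated assumptions, not the thesis's method. It assumes that the weight of an itemset is the mean of its item weights, that weighted support is this weight multiplied by ordinary support, and that grouping is balanced by greedily assigning the heaviest remaining item to the currently lightest group, with item frequency as a stand-in for load. The function names, weights, and transactions are hypothetical.

```python
import heapq
from collections import Counter


def weighted_support(itemset, transactions, weights):
    """Assumed definition: mean item weight times the fraction of
    transactions that contain every item of the itemset."""
    items = set(itemset)
    hits = sum(1 for t in transactions if items <= set(t))
    mean_weight = sum(weights[i] for i in items) / len(items)
    return mean_weight * hits / len(transactions)


def balanced_groups(loads, n_groups):
    """Assumed strategy: greedy assignment of items (heaviest first)
    to whichever group currently has the smallest total load."""
    heap = [(0, g) for g in range(n_groups)]  # (total load, group index)
    heapq.heapify(heap)
    groups = [[] for _ in range(n_groups)]
    # Sort by descending load; ties are broken by item name for determinism.
    for item, load in sorted(loads.items(), key=lambda kv: (-kv[1], kv[0])):
        total, g = heapq.heappop(heap)
        groups[g].append(item)
        heapq.heappush(heap, (total + load, g))
    return groups


# Hypothetical toy data, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "beer", "eggs"],
    ["milk", "beer", "bread"],
    ["milk", "eggs"],
]
weights = {"bread": 0.2, "milk": 0.4, "beer": 0.9, "eggs": 0.5}

ws = weighted_support(["bread", "milk"], transactions, weights)

# Item frequency is used here as a rough proxy for per-item mining load.
loads = Counter(item for t in transactions for item in t)
groups = balanced_groups(loads, 2)
```

In a MapReduce setting, each group produced this way would correspond to one reducer's share of items, so that no single reducer receives most of the heavy items. A real load estimate would likely depend on the item's position in the frequent-item list rather than on raw frequency alone.
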
Keywords/Search Tags:Association rules, Ordered FP-tree, Two-dimensional list, Weighted model, MapReduce