Algorithm For Mining Frequent Itemsets And Its Optimization

Posted on: 2019-05-31
Degree: Master
Type: Thesis
Country: China
Candidate: S Wu
GTID: 2417330563493057
Subject: Applied Statistics
Abstract/Summary:
With the rapid development of information technology, the volume of data stored in databases is growing at a remarkable rate, and data mining has emerged as a way to make better use of it. Association rule mining, an important branch of data mining, discovers interesting patterns in data and has been widely studied and applied in many fields. The mining process can be roughly divided into two steps: first, find all frequent itemsets in the transaction database; second, generate the corresponding association rules from those frequent itemsets. The first step is the key one, for two reasons: mining frequent itemsets is very time-consuming, and once the frequent itemsets are known, the second step becomes comparatively simple. This thesis first introduces the background of data mining and then focuses on algorithms for mining frequent itemsets, including the classical Apriori and FP-growth algorithms. To address shortcomings of the traditional FP-growth algorithm, it proposes an improvement: when a conditional FP-tree consists of a single path, the frequent itemsets can be obtained directly by enumerating combinations of the items on that path, without further recursion. In the final part, all of the algorithms are implemented in Python, and their performance and efficiency are compared through a series of experiments. The results show that the improved algorithm mines frequent itemsets more efficiently on large-scale data sets.
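The single-path shortcut described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function name `single_path_itemsets`, the `(item, count)` path representation, and the `min_count` parameter are assumptions made for illustration. The sketch relies on the fact that, on a single path, the support of any combination of items is the smallest count among the chosen nodes.

```python
from itertools import combinations


def single_path_itemsets(path, suffix=(), min_count=1):
    """Enumerate frequent itemsets from a single-path conditional FP-tree.

    path      -- list of (item, count) pairs, root to leaf (hypothetical layout)
    suffix    -- items the conditional tree was built for
    min_count -- minimum absolute support
    """
    # Nodes below the threshold cannot appear in any frequent itemset.
    nodes = [(item, count) for item, count in path if count >= min_count]
    base = frozenset(suffix)

    result = {}
    # Every non-empty combination of path nodes is frequent once joined
    # with the suffix; no recursive tree construction is needed.
    for size in range(1, len(nodes) + 1):
        for combo in combinations(nodes, size):
            itemset = base | frozenset(item for item, _ in combo)
            # Support is limited by the least frequent node chosen.
            result[itemset] = min(count for _, count in combo)
    return result


if __name__ == "__main__":
    # Hypothetical conditional tree for suffix "d": a(4) -> b(3) -> c(2)
    found = single_path_itemsets([("a", 4), ("b", 3), ("c", 2)],
                                 suffix=("d",), min_count=2)
    for itemset, support in sorted(found.items(),
                                   key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), support)
```

A path with n qualifying nodes yields 2^n - 1 itemsets, so the saving comes from skipping the construction of further conditional trees, not from producing fewer results.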
Keywords/Search Tags: Apriori algorithm, FP-growth algorithm, Python