| With the rapid development of modern science and technology, especially computer technology, and the continued spread of information systems, we have entered the era of big data. Faced with such volumes of data, people can seem helpless. Data mining has developed and matured in this environment. It integrates knowledge from many disciplines, including pattern recognition, artificial intelligence, machine learning, and statistics, and is therefore highly valued in many fields. The ultimate goal of data mining is to extract useful knowledge from seemingly cluttered transaction data sets. Mining association rules accurately and efficiently from massive data sets is an active research direction, and mining frequent patterns is its first step. This paper studies methods for frequent pattern mining. It covers three aspects: optimizing the DiffNodeset data structure and applying it to a frequent itemset mining algorithm; using an improved binary particle swarm optimization (BPSO) algorithm to mine large but sparse data sets; and combining fuzzy set theory with frequent itemset mining to find fuzzy frequent itemsets. The main contents are as follows. (1) Frequent pattern mining algorithms based on the DiffNodeset data structure are studied in depth. To address the problems of DiffNodeset-based algorithms, a method based on the BNodeset data structure is proposed. An optimized node encoding method is used to encode each node, and several optimization strategies are then used to prune the search space. Theoretical analysis and experiments show that the proposed algorithm improves the time efficiency of mining frequent itemsets and effectively reduces memory usage at runtime. (2) Because very large but sparse data sets are difficult to process with the BNodeset data structure, association rule mining based on the particle swarm optimization (PSO) algorithm is introduced, and an improved BPSO-based method for mining frequent itemsets is proposed. The method addresses three aspects: fitness function design, preprocessing of the initial population, and pruning of the search space. Preprocessing the initial population ensures that the particles start with a reasonable fitness. A dynamic search-space pruning method is then designed, which reduces the dimensionality of the data set while the algorithm runs. Theoretical analysis and experiments show that the proposed algorithm significantly improves mining efficiency. (3) Classical frequent itemset mining algorithms mainly find Boolean relations in the data, so they have difficulty mining inexact or fuzzy concepts. Therefore, drawing on fuzzy set theory, a compressed tree structure is applied to quantitative databases to mine fuzzy frequent itemsets, and an effective pruning strategy is used to reduce the search space. Theoretical analysis and experiments show that the proposed algorithm significantly improves the efficiency of mining frequent itemsets and reduces memory consumption. In conclusion, this paper makes some progress in research on data mining algorithms, and applying the proposed algorithms in practice has practical significance. |
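
For readers unfamiliar with the underlying task, the following is a minimal sketch of what mining frequent itemsets from a transaction data set means. It uses a plain level-wise (Apriori-style) search, not the BNodeset, BPSO, or fuzzy methods summarized above; the function names `support_count` and `frequent_itemsets`, the `min_count` threshold, and the toy transactions are all illustrative assumptions.

```python
from itertools import combinations


def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_count):
    """Level-wise search; returns {frozenset: support count}.

    Illustrative baseline only (an assumption of this sketch),
    not the algorithm proposed in the text.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]  # candidate 1-itemsets
    result = {}
    while level:
        # Keep only the candidates that reach the minimum support count.
        kept = {}
        for cand in level:
            count = support_count(cand, transactions)
            if count >= min_count:
                kept[cand] = count
        result.update(kept)
        # Join: merge frequent k-itemsets that differ in exactly one item.
        joined = {a | b for a, b in combinations(kept, 2)
                  if len(a | b) == len(a) + 1}
        # Prune: every k-subset of a (k+1)-candidate must itself be frequent.
        level = [c for c in joined
                 if all(frozenset(s) in kept
                        for s in combinations(c, len(c) - 1))]
    return result


# Hypothetical toy data: five transactions over the items a, b, c.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
found = frequent_itemsets(transactions, min_count=3)
for itemset, count in sorted(found.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The prune step is where the search space is cut: a candidate is discarded without counting if any of its subsets is already known to be infrequent. Pruning strategies of this general kind are what the methods above refine with more compact data structures.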