| Frequent pattern mining is a classical topic in data mining. A basic problem in applying it is that the set of frequent patterns mined from a database is usually huge and redundant. This is because traditional mining methods discriminate between patterns using a rigid support-confidence threshold, so random noise in the data may cause the mining system to recognize a single pattern as several different patterns. This thesis studies concise representations of frequent itemsets, based on itemsets defined in the disjunctive space, in order to eliminate redundancy in frequent pattern mining results. The main contributions are as follows: (1) For itemset patterns in the disjunctive space, our studies show that redundancy caused by random noise still exists, and that this noise takes the form of small, local perturbations. The δ-neighborhood is therefore defined. Based on the δ-neighborhood, a new concise representation of frequent patterns is proposed, and its characteristics, its accuracy, and the strategy for restoring the original itemsets are analyzed. Depth-first recursive search and heuristic search strategies are employed to design DCPM, a novel and efficient mining algorithm. Experimental results show that the concise set defined by the model is significantly smaller than the set of traditional disjunctive closed itemsets, and that the average support error in restoring the original frequent itemsets is also low. (2) In the process of δ-neighborhood partitioning, our studies show that different δ-neighborhoods overlap and that this overlap is widespread. Incorrectly partitioning overlapped itemsets increases the support error and in turn introduces redundancy when the original itemsets are restored. Three techniques, namely the relay node, the cross node, and the disjunctive support of the alternative set, are therefore studied to optimize δ-neighborhood partitioning. On this basis, DCPM is improved and a new algorithm called NDCPM is proposed. In addition, some effective techniques used in DCPM are applied to improve the classic MEP algorithm, resulting in the algorithm NFMEP. Experimental results show that NDCPM produces more accurate mining results and that NFMEP achieves higher runtime efficiency. |