Research On Frequent Sequential Pattern Mining Compression Using Approximate Partial Order

Posted on: 2008-11-29   Degree: Master   Type: Thesis
Country: China   Candidate: H W Dan   Full Text: PDF
GTID: 2178360212984937   Subject: Computer applications
Abstract/Summary:
With the development of computer science and the widespread use of networks, extracting knowledge from massive data has become increasingly difficult. The bottleneck in many data mining problems is not efficiency but the quality of the mined patterns. In this paper, we propose a novel post-processing approach to summarizing frequent sequential patterns. Using approximate partial orders (ApproxPO), we compress the sequential patterns and extract their order information. We propose a uniform measure function to balance efficiency and quality, and describe the process of obtaining approximate partial orders from clusters. A thorough experimental study on both real and synthetic datasets shows that ApproxPO can compress sequences into high-quality partial orders efficiently. The algorithm consists of the following stages:
Data Preparation: cleaning the original data and generating the frequent sequential patterns or closed sequential patterns that serve as input to the ApproxPO algorithm.
Distance Definition: defining three different distances between patterns, which are then used to group the sequential patterns into clusters.
Clustering: clustering the sequential patterns with two clustering methods (k-means clustering and hierarchical clustering), each combined with the three distances.
Partial Order Generation: generating approximate partial orders from the sequential pattern clusters; we introduce the generation methods and the algorithm.
Experiments: evaluating the efficiency of ApproxPO under the different distances and clustering methods, and evaluating the quality of the generated partial orders.
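The stages above can be illustrated with a small Python sketch. It is not the ApproxPO algorithm itself: the abstract does not specify the three distances or the partial order generation procedure, so everything here is an assumption made for illustration. The distance is a hypothetical one based on the longest common subsequence, the clustering is a plain single-linkage merge with a hypothetical `threshold` parameter, and `precedence_pairs` simply keeps the orderings that no pattern in a cluster contradicts. The resulting relation may still contain cycles in general, which a real partial order generator would have to resolve.

```python
from itertools import combinations


def lcs_length(a, b):
    """Length of the longest common subsequence of two patterns."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def distance(a, b):
    """Assumed distance (not from the thesis): 1 - LCS / longer length.

    Both patterns are assumed to be non-empty.
    """
    return 1.0 - lcs_length(a, b) / max(len(a), len(b))


def cluster(patterns, threshold):
    """Single-linkage agglomerative clustering with a distance cut-off.

    Two clusters are merged whenever some pair of their patterns is
    within `threshold` of each other (a hypothetical parameter).
    """
    clusters = [[p] for p in patterns]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(clusters)), 2):
            if any(distance(p, q) <= threshold
                   for p in clusters[i] for q in clusters[j]):
                clusters[i].extend(clusters.pop(j))
                merged = True
                break  # indices changed, so restart the scan
    return clusters


def precedence_pairs(cluster_patterns):
    """Pairs (a, b) where a precedes b in some pattern and b never precedes a.

    This is only a candidate order relation: it is antisymmetric by
    construction, but it is not guaranteed to be acyclic.
    """
    seen = set()
    for p in cluster_patterns:
        for i, j in combinations(range(len(p)), 2):
            if p[i] != p[j]:
                seen.add((p[i], p[j]))
    return {(a, b) for (a, b) in seen if (b, a) not in seen}
```

Calling `cluster(patterns, 0.5)` on a list of item tuples and then `precedence_pairs` on each resulting cluster gives one candidate order relation per cluster, which mirrors the clustering and partial order generation stages in miniature.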
Keywords/Search Tags: Data Mining, Sequential Pattern, Compression, Partial Order