Research On Sequential Pattern Mining Algorithm Based On Constraints

Posted on: 2023-01-07 | Degree: Master | Type: Thesis
Country: China | Candidate: W Ye
GTID: 2568306788456744 | Subject: Computer Science and Technology
Abstract/Summary:
With the spread of big data and the Internet of Things, data is being generated ever faster and in ever larger volumes. Data mining emerged to extract useful information from this mass of data. Pattern mining is an important research field within data mining; its goal is to discover possible associations between things in the data. Finding items that frequently occur together is called frequent itemset mining. Sequential pattern mining extends frequent itemset mining: it considers not only whether items appear in the transaction database, but also the order in which they appear. It is more widely applicable in practice and therefore has greater research value. Constraint-based sequential pattern mining builds on sequential pattern mining by embedding constraints in the mining process, which saves considerable time and space and yields more suitable sequential patterns. To mine more suitable sequential patterns more efficiently, this thesis proposes a sequential pattern mining algorithm based on an interest constraint and sequential pattern mining algorithms based on flexible constraints, both built on MOOC course selection data. Specifically, the main research contents are:

1) A sequential pattern mining algorithm based on an interest constraint is proposed. First, unexpected support is designed to replace traditional support. Second, it is proved that unexpected support also satisfies the downward closure property. Then, an item list structure and a sequence list structure are redefined, and a new sequence location list structure is proposed according to the characteristics of the data set. These three structures are used for pruning, which narrows the search space. Finally, the new algorithm, FAST-USP, is described and explained in detail, and experimental results are reported for running time, memory consumption, and the number of mined patterns, which verify the superiority of the FAST-USP algorithm.

2) A novel sequential pattern mining approach based on flexible constraints is proposed. First, three kinds of constraints are proposed: a length constraint, a discreteness constraint, and a validity constraint. They describe, respectively, the length of the course selection sequence, the discreteness of course selection times, and the validity of course selection times. The three constraints are combined in a certain proportion to form a flexible constraint. This flexible constraint is embedded into support, the most important parameter in sequential pattern mining, to form a support with flexible constraints. Finally, two new sequential pattern mining algorithms, SPM-FC-L and SPM-FC-P, are proposed, which mine sequential patterns in an apriori-like way and a pattern-growth way, respectively. Experimental results are reported for five aspects: running time, memory consumption, the number of mined patterns, the use of constraints, and the mined patterns themselves, which verify the superiority of the proposed algorithms.
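
To make the role of support and the downward closure property concrete, the following is a minimal Python sketch of a generic apriori-like (level-wise) sequential pattern miner. It is not FAST-USP or SPM-FC-L: it uses plain support rather than unexpected support or support with flexible constraints, treats each sequence as an ordered list of single items, and all names (`is_subsequence`, `support`, `mine`) and the course identifiers are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Tuple

Pattern = Tuple[str, ...]


def is_subsequence(pattern: Pattern, sequence: Pattern) -> bool:
    """True if the items of `pattern` occur in `sequence` in the same order."""
    it = iter(sequence)
    # `item in it` consumes the iterator, so order is enforced.
    return all(item in it for item in pattern)


def support(pattern: Pattern, database: List[Pattern]) -> int:
    """Plain (absolute) support: number of sequences containing the pattern."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def mine(database: List[Pattern], min_support: int) -> Dict[Pattern, int]:
    """Level-wise mining that extends only frequent patterns.

    Downward closure: if a pattern is infrequent, every pattern that
    extends it is infrequent too, so infrequent candidates are dropped
    and never extended.
    """
    items = sorted({item for seq in database for item in seq})
    frequent: Dict[Pattern, int] = {}
    level: List[Pattern] = [(item,) for item in items]
    while level:
        survivors = []
        for candidate in level:
            sup = support(candidate, database)
            if sup >= min_support:
                frequent[candidate] = sup
                survivors.append(candidate)
        # Candidates of length k+1 are built from frequent length-k patterns only.
        level = [p + (item,) for p in survivors for item in items]
    return frequent


# Hypothetical course selection sequences, one per learner.
courses = [
    ("math", "python", "ml"),
    ("math", "ml"),
    ("python", "ml", "math"),
    ("math", "python"),
]
patterns = mine(courses, min_support=2)
```

In this sketch the pruning works only because plain support can never increase when a pattern is extended; a replacement measure such as unexpected support needs the same property, which is why the abstract stresses that it is proved.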
Keywords/Search Tags: data mining, sequential pattern mining, unexpected support, flexible constraints, downward closure property