| Traditional sequential pattern mining algorithms use support as the measure for mining frequent sequential patterns from massive data. As these algorithms have been applied in practice, researchers have realized that frequent sequential patterns are not necessarily interesting ones. An appropriate interestingness measure helps to mine interesting sequential patterns and to extract potentially interesting knowledge from large amounts of data. This paper therefore studies sequential pattern mining algorithms based on interestingness measures. The main work is as follows: (1) To address the redundant subsequences in the mining results of SKopus, a top-k interesting sequential pattern mining algorithm based on an interestingness measure, this paper proposes an improved algorithm called SKopus M. SKopus M builds on the SKopus framework and uses the improved EFKN strategy to mine the top-k interesting supersequences in the search space. The experimental results show that the mining results of SKopus M contain no redundant subsequences and are richer and more diverse than those of SKopus. Compared with traditional support-based methods and the state-of-the-art top-k interesting sequential pattern mining algorithm, the mining results of SKopus M also contain more interesting knowledge. In addition, to mine interesting long sequential patterns that carry more information and more meaningful knowledge, an improved interestingness measure that takes the length of the sequence into account is proposed. Integrating this measure into SKopus yields an improved algorithm for mining interesting long sequential patterns. The EFKN strategy is used to remove the redundant subsequences from the mining results of this improved algorithm. To improve its efficiency, an improved vertical-format data structure, VDFK, is used to accelerate the 
calculation of the length-aware interestingness measure, and a new pruning strategy based on the upper bound of this measure is adopted to reduce the search space. Based on these improvements, this paper proposes the improved SKopus KM algorithm. The experimental results show that, compared with SKopus, SKopus KM is better at mining interesting long sequential patterns and produces more diverse results. Compared with other methods, the result sequences of SKopus KM are longer on average and contain more interesting knowledge. (2) To overcome the limitation that SKopus can only mine sequence databases in which every itemset contains a single item, an interesting itemset sequence mining algorithm, ISKopus, based on an interestingness measure, is proposed. Integrating i-extension into SKopus extends it to itemset sequence databases, which broadens its scope of application. The s-step and i-step pruning strategies are used to reduce the search space. The experimental results show that, compared with traditional methods, ISKopus is better at mining the interesting knowledge contained in itemset sequence databases. To mine interesting itemset sequences further, a sequence interestingness measure that integrates itemset interest is proposed. Combining this measure with the ISKopus framework yields an interesting itemset sequence mining algorithm, ISKopus-. A reasonable upper bound on the interestingness is designed to reduce the search space. The experimental results show that, compared with ISKopus, the ISKopus- algorithm is better at mining interesting sequences that contain interesting itemsets. Compared with traditional methods, the mining results of the ISKopus- algorithm are more informative and contain more interesting knowledge. |
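
The difference between extending a pattern with a new itemset and extending its last itemset can be illustrated with a minimal sketch. It uses the standard textbook definitions of s-extension, i-extension, and support over an itemset sequence database. It is not the ISKopus implementation, and every name in it (`seq`, `contains`, `support`, `s_extend`, `i_extend`, `db`) is hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, List, Tuple

Itemset = FrozenSet[str]
ItemsetSequence = Tuple[Itemset, ...]


def seq(*itemsets: str) -> ItemsetSequence:
    """Build an itemset sequence from strings: seq("ab", "c") is <{a,b},{c}>."""
    return tuple(frozenset(s) for s in itemsets)


def contains(sequence: ItemsetSequence, pattern: ItemsetSequence) -> bool:
    """Return True if `pattern` is a subsequence of `sequence`.

    Each pattern itemset must be a subset of a distinct itemset of
    `sequence`, in the same order. Greedy earliest matching suffices.
    """
    pos = 0
    for itemset in sequence:
        if pos < len(pattern) and pattern[pos] <= itemset:
            pos += 1
    return pos == len(pattern)


def support(database: List[ItemsetSequence], pattern: ItemsetSequence) -> int:
    """Count the sequences in `database` that contain `pattern`."""
    return sum(1 for s in database if contains(s, pattern))


def s_extend(pattern: ItemsetSequence, item: str) -> ItemsetSequence:
    """s-extension: append `item` as a new single-item itemset."""
    return pattern + (frozenset({item}),)


def i_extend(pattern: ItemsetSequence, item: str) -> ItemsetSequence:
    """i-extension: add `item` to the last itemset of the pattern."""
    if not pattern:
        raise ValueError("i-extension needs a non-empty pattern")
    return pattern[:-1] + (pattern[-1] | {item},)


# Toy database of four itemset sequences (made up for this example).
db = [seq("ab", "c"), seq("a", "bc"), seq("ab", "bc"), seq("c", "ab")]

base = seq("a")
print(support(db, s_extend(base, "b")))  # <{a},{b}>  -> 2
print(support(db, i_extend(base, "b")))  # <{a,b}>    -> 3
```

A miner restricted to s-extension can never generate `<{a,b}>`, which is why a database whose itemsets hold several items needs both kinds of extension. The sketch counts plain support only; an interestingness measure and its upper-bound pruning would replace or complement `support` in an actual search.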