
Mining Local Periodic Patterns In A Discrete Sequence

Posted on: 2020-02-02 | Degree: Master | Type: Thesis
Country: China | Candidate: P Yang | Full Text: PDF
GTID: 2428330611999748 | Subject: Computer technology
Abstract/Summary:
Mining frequent patterns is a traditional data mining task that aims at finding all itemsets (sets of items or symbols) appearing frequently in a transactional database. Because the search space for this task grows exponentially with the number of distinct items, and because a huge number of patterns may be found, developing efficient algorithms that select a small set of interesting patterns has become an important problem. Several definitions of what makes a frequent pattern interesting have been proposed, each suitable for different applications. However, many of them do not consider the time dimension.

Periodic pattern mining is an emerging task that extends frequent pattern mining. It consists of finding the sets of frequent events or items that appear periodically in a sequence of events or transactions. Many algorithms have been designed to identify periodic frequent patterns in data. However, most of them rely on the implicit assumption that the periodic behavior of a pattern does not change much over time. This assumption is unrealistic for many real-life applications, as the frequency of an event may vary over time and a pattern may be periodic only in some time-intervals rather than in the whole database. As a result, traditional periodic pattern mining algorithms may miss several partially periodic patterns.

To address this limitation, this dissertation proposes to discover a novel type of periodic pattern in a sequence of events or transactions, called Local Periodic Patterns (LPPs), which are patterns (sets of events) that have a periodic behavior in some time-intervals that are not predefined. A pattern is said to be a local periodic pattern if it appears regularly and continuously in some time-interval(s). This dissertation applies a cumulative sum to consider the relationship between each period and the previous ones. This definition of periodicity has the advantage of being more noise-tolerant: a pattern that exceeds the user-defined thresholds at some time points is not necessarily discarded, because the preceding periods contribute to the cumulative sum. Finally, two novel measures are proposed to assess the periodicity and frequency of patterns in time-intervals. The maxSoPer (maximal period of spillovers) measure allows detecting time-intervals of variable lengths where a pattern is continuously periodic, while the minDur (minimal duration) measure ensures that those time-intervals have a minimal duration.

To discover all LPPs, we present three efficient algorithms named LPPMbreadth, LPPMdepth and LPP-Growth, which respectively adopt a breadth-first, a depth-first and a pattern-growth approach to enumerate all desired patterns. The first two algorithms adopt a binary database representation, while the last one compresses the database using a compact tree structure to discover the complete set of local periodic patterns. The algorithms rely on novel techniques for reducing the search space to ensure that local periodic patterns are found efficiently. An experimental evaluation on real datasets shows that the proposed algorithms are efficient and can provide useful patterns that cannot be found using traditional periodic pattern mining algorithms.
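The cumulative-sum idea behind the two measures can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: a per-period threshold `max_per`, a running surplus that grows by `period - max_per` and never drops below zero, an interval that ends once the surplus exceeds `max_so_per`, and a minimum interval length `min_dur`. The function name and parameter names are hypothetical, and the sketch handles a single pattern given as a sorted list of occurrence timestamps rather than a full database.

```python
from typing import List, Tuple


def local_periodic_intervals(
    timestamps: List[int],
    max_per: int,      # assumed threshold on a single period
    max_so_per: int,   # assumed cap on the accumulated surplus ("spillover")
    min_dur: int,      # assumed minimum length of a reported interval
) -> List[Tuple[int, int]]:
    """Return (start, end) intervals where one pattern stays periodic.

    Illustrative only: the update rule below is an assumption, not the
    definition used in the dissertation.
    """
    intervals: List[Tuple[int, int]] = []
    start = None   # start of the currently open interval, if any
    so_per = 0     # cumulative surplus of periods over max_per

    for prev, cur in zip(timestamps, timestamps[1:]):
        per = cur - prev
        if start is None:
            # Open an interval only on a period within the threshold.
            if per <= max_per:
                start = prev
                so_per = 0
            continue
        # Short periods pay back earlier surplus; long ones add to it.
        so_per = max(0, so_per + per - max_per)
        if so_per > max_so_per:
            # Too much accumulated surplus: close the interval at prev.
            if prev - start >= min_dur:
                intervals.append((start, prev))
            start = None
            so_per = 0

    # Close an interval that is still open at the end of the sequence.
    if start is not None and timestamps[-1] - start >= min_dur:
        intervals.append((start, timestamps[-1]))
    return intervals


occurrences = [1, 2, 3, 6, 7, 8, 20, 21, 22, 23]
tolerant = local_periodic_intervals(occurrences, max_per=2, max_so_per=1, min_dur=3)
strict = local_periodic_intervals(occurrences, max_per=2, max_so_per=0, min_dur=3)
print(tolerant)  # [(1, 8), (20, 23)]
print(strict)    # [(20, 23)]
```

In this toy sequence the single long gap between timestamps 3 and 6 is absorbed when `max_so_per=1`, so the first interval survives. With `max_so_per=0` the same gap splits it into pieces shorter than `min_dur`, which are then dropped.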
Keywords/Search Tags: data mining, periodic pattern mining, local periodic pattern