Implication-Constraint Framework And Mining Algorithm For Generalized Association Analysis

Posted on: 2015-03-05  Degree: Master  Type: Thesis
Country: China  Candidate: M Q Zou  Full Text: PDF
GTID: 2268330431967526  Subject: Computer software and theory
Abstract/Summary:
The main purpose of association analysis, also known as association mining, is to find frequent patterns, associations, correlations, or causal structures among items or objects in transaction data, relational data, or other information carriers. In this thesis, any association analysis, whether based on transactions or not, is called generalized association analysis. Transaction-based association analysis relies on the support-confidence framework, and non-transaction-based analysis uses analogous frameworks; for example, spatial co-location pattern mining uses the participation-conditional probability framework. First, we present the implication-constraint framework together with an evaluation system that assesses strong association rules by correctness, reliability, and interest. Second, we narrow the range of min_conf from (0,1] to (0.5,1] and the range of min_sup from (0,1] to (0, min_conf). Third, we present a method called maximum clique of random vertex divisions, which transforms non-transaction-based association analysis into transaction-based analysis. After data updates such as additions, deletions, and alterations, the method obtains the new divisions by modifying only part of the old ones. Fourth, we introduce the concept of a mapping from advanced mathematics and classify constraints into before-constraints, during-constraints, and after-constraints, formally defining how each is used. Fifth, we present the multi-dimensions and multi-layers algorithm, which is based on the monotonicity of support and applies to most association analysis problems. Sixth, we present a storage structure named the multi-knowledge tree, which effectively reduces the storage space of the data, and we propose a new multi-pruning algorithm that returns results promptly after data updates. Lastly, we verify the theory and the efficiency of the algorithms through extensive experiments.
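The abstract does not give the thesis's own algorithms, so the following is only a minimal sketch of the standard support-confidence framework it builds on, together with the narrowed threshold ranges and a level-wise search that exploits the monotonicity of support. It is a textbook Apriori-style illustration, not the multi-dimensions and multi-layers algorithm itself; all function names (`support`, `confidence`, `check_thresholds`, `frequent_itemsets`) are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent); 0.0 if the antecedent never occurs."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both, transactions) / base


def check_thresholds(min_sup, min_conf):
    """Enforce the ranges stated in the abstract (assumed to be hard limits):
    min_conf in (0.5, 1] and min_sup in (0, min_conf)."""
    if not 0.5 < min_conf <= 1:
        raise ValueError("min_conf must lie in (0.5, 1]")
    if not 0 < min_sup < min_conf:
        raise ValueError("min_sup must lie in (0, min_conf)")


def frequent_itemsets(transactions, min_sup):
    """Level-wise search. Because support is monotone (a superset is never
    more frequent than its subsets), a candidate is counted only if all of
    its subsets one item smaller are already known to be frequent."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items
             if support([i], transactions) >= min_sup]
    result = {}
    while level:
        for s in level:
            result[s] = support(s, transactions)
        k = len(level[0]) + 1
        current = set(level)
        # Join step: unions of two frequent sets that have exactly k items.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == k}
        # Prune step, then count the surviving candidates.
        level = [c for c in candidates
                 if all(c - {i} in current for i in c)
                 and support(c, transactions) >= min_sup]
    return result
```

A rule X -> Y would then be reported as strong when `support(X | Y)` is at least `min_sup` and `confidence(X, Y)` is at least `min_conf`, with both thresholds first validated by `check_thresholds`.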
Keywords/Search Tags: Generalized association analysis, Implication-constraint framework, Maximum clique of random vertex divisions, Reasonable threshold, Multi-knowledge tree