Research On Association Rule Mining Of Transaction Data Based On Information Entropy

Posted on: 2020-12-03
Degree: Master
Type: Thesis
Country: China
Candidate: G Y Cheng
Full Text: PDF
GTID: 2417330590960720
Subject: Statistics
Abstract/Summary:
Data mining is a technology that has flourished in the era of big data. It uses computing techniques to analyze large, complex databases and to address problems that traditional statistics cannot solve. Association analysis, also known as association rule mining, is an important branch of data mining research. It is applied mainly to transactional data and is used to explore the associations between things. FP-Growth, a classical algorithm in association rule mining, finds the associations between items in a data set with a divide-and-conquer strategy. However, because it treats all items in the data set as equally important (the "equality and consistency" assumption), some important associations are missed during mining. Weighted association rule mining emerged in response. The existing weighted association rule algorithms have a limitation of their own, however: they do not take into account the degree of disorder, or uncertainty, of the data set itself. By studying transactional data, this paper proposes an improved weighted association rule mining algorithm that can effectively handle highly disordered transaction data sets and discover more potential or valuable associations.

This paper covers four main aspects. Firstly, from the perspectives of traditional statistics and data mining, it studies the association between things and summarizes the types and characteristics of transaction data. Secondly, it reviews the theory of association analysis and analyzes the classical association algorithm, which regards all items in the data set as equally important. Thirdly, the existing weighted association algorithms cannot account for the degree of disorder of the data set itself, so potential associations may be omitted from the mining results. To address this, the paper introduces the theory of information entropy and proposes an improved weighted association rule mining algorithm based on FP-Growth, called IEFP-Growth. Fourthly, the classical FP-Growth and IEFP-Growth algorithms are both used to mine association rules in the Crime data sets, and the resulting rules are analyzed and compared. The improved algorithm is found to discover valuable association rules that differ from those found by the classical algorithm, and the conditions under which it is applicable are studied. In addition, the algorithm's applicability to different data sets is verified by mining association rules in the IMDB data set.

The main conclusions are as follows. Firstly, item weighting is a necessary improvement to association rule mining algorithms, because items differ in importance. Secondly, the improved algorithm, IEFP-Growth, can effectively mine the associations in large transactional data sets by introducing an information entropy weighting model to quantify the uncertainty of information. Thirdly, compared with the classical association algorithm, the improved algorithm produces some results in common and some that differ, and it can find potential or valuable association rules that the classical algorithm does not. It has certain conditions of applicability. In practical applications, combining the two algorithms yields a richer and more complete set of associations.
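The abstract does not give the weighting formula used by IEFP-Growth, so the following is only a minimal sketch of what entropy-based item weighting could look like, not the thesis's method. It assumes each item's weight is the binary Shannon entropy of its presence or absence across transactions, and that an itemset's weighted support is its ordinary support multiplied by the mean weight of its items. The function names `item_entropy_weights` and `weighted_support` are hypothetical.

```python
import math
from collections import Counter


def item_entropy_weights(transactions):
    """Assumed weighting: binary Shannon entropy (in bits) of each item's
    presence/absence across transactions. Not taken from the thesis."""
    n = len(transactions)
    # Count each item at most once per transaction.
    counts = Counter(item for t in transactions for item in set(t))
    weights = {}
    for item, c in counts.items():
        p = c / n  # fraction of transactions that contain the item
        if 0.0 < p < 1.0:
            weights[item] = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
        else:
            # An item present in every transaction carries no uncertainty.
            weights[item] = 0.0
    return weights


def weighted_support(itemset, transactions, weights):
    """Assumed definition: ordinary support times the mean item weight."""
    items = set(itemset)
    support = sum(items <= set(t) for t in transactions) / len(transactions)
    mean_weight = sum(weights[i] for i in items) / len(items)
    return support * mean_weight


# Toy data, invented for illustration only.
transactions = [["a", "b"], ["a"], ["a", "c"], ["a", "b"]]
weights = item_entropy_weights(transactions)
```

Under this assumed scheme, an item that appears in every transaction (here `a`) gets weight 0, while an item present in half of them (`b`) gets the maximum weight of 1 bit. A weighted variant of FP-Growth would then prune candidates with a threshold on weighted support rather than on raw support.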
Keywords/Search Tags: transaction data, association rules, FP-Growth, information entropy