
Research On Multi-level Association Rules Algorithm And Decision Tree Algorithm In Data Mining

Posted on: 2022-08-06 | Degree: Master | Type: Thesis
Country: China | Candidate: Q Guo | Full Text: PDF
GTID: 2558307145962109 | Subject: Software engineering
Abstract/Summary:
In the information age, with the rapid development of life sciences and information technology, big data applications have reached a new stage. Because the volume of data is so large, people cannot readily extract useful information and scientific knowledge from large amounts of heterogeneous data, and many data mining techniques for processing large-scale data have emerged in response. Data mining continuously extracts models or rules that are valuable to people from messy data of all kinds, and the advantages of these models and rules have led to the wide use of data mining in all walks of life. This thesis studies the multi-level association rule algorithm and the decision tree algorithm in data mining. The main innovations are as follows:

(1) An improved Cumulate algorithm. Cumulate has two main problems: it generates many redundant itemsets, and it scans the database multiple times. This thesis focuses on reducing redundant itemsets by improving the generation of candidate 2-itemsets. The execution order of the algorithm's steps is changed to reduce its space complexity, and a hashing technique then maps the candidate itemsets to buckets. Candidates are screened first by the hash function and then by the minimum support, which reduces the number of redundant itemsets. Hashing thus mitigates, to a certain extent, the redundant-itemset problem of the multi-level association rule algorithm and improves its execution efficiency. A case analysis and experiments on simulated data show that the improved algorithm runs efficiently, and it is then applied to recruitment data for design director positions to demonstrate its usability.

(2) An improved ID3 algorithm. ID3 suffers from multi-value bias. To address this, the thesis uses a correlation function and prior knowledge to improve the accuracy of the algorithm, which reduces multi-value bias to a certain extent, and uses the Maclaurin formula to cut computation time and improve running efficiency. The improved ID3 algorithm is illustrated with examples and evaluated experimentally on UCI data sets using the Weka tool; the results show that it does improve accuracy. The improved algorithm is then applied to a real film review data set to demonstrate its usability.
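The two-stage screening of candidate 2-itemsets described in item (1) can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not specify the hash function, bucket count, or data layout, so `bucket_of`, `num_buckets`, the toy `taxonomy`, and the sample `transactions` below are all assumptions chosen for illustration. The sketch follows the general idea of hash-based pair filtering on top of Cumulate-style transactions extended with taxonomy ancestors.

```python
from collections import Counter
from itertools import combinations


def ancestors(item, taxonomy):
    """Return all ancestors of an item in a child -> parent taxonomy."""
    result = []
    while item in taxonomy:
        item = taxonomy[item]
        result.append(item)
    return result


def frequent_pairs(transactions, taxonomy, min_support, num_buckets=7):
    """Find frequent 2-itemsets across taxonomy levels (illustrative only)."""
    # Cumulate-style step: extend each transaction with its items' ancestors.
    extended = [
        sorted(set(t) | {a for item in t for a in ancestors(item, taxonomy)})
        for t in transactions
    ]
    item_counts = Counter(item for t in extended for item in t)
    item_id = {item: i for i, item in enumerate(sorted(item_counts))}

    def related(a, b):
        # A pair made of an item and its own ancestor carries no information.
        return a in ancestors(b, taxonomy) or b in ancestors(a, taxonomy)

    def bucket_of(a, b):
        # Hypothetical deterministic hash function; any pair hash would do.
        return (item_id[a] * len(item_id) + item_id[b]) % num_buckets

    # Pass 1: hash every pair occurrence into a bucket and count per bucket.
    bucket_counts = [0] * num_buckets
    for t in extended:
        for a, b in combinations(t, 2):
            if not related(a, b):
                bucket_counts[bucket_of(a, b)] += 1

    # Screening 1 (hash): keep a candidate pair only if both items are
    # frequent and its bucket count reaches min_support. A bucket count is an
    # upper bound on the pair's support, so no frequent pair is lost here.
    frequent_items = sorted(i for i, c in item_counts.items() if c >= min_support)
    candidates = {
        (a, b)
        for a, b in combinations(frequent_items, 2)
        if not related(a, b) and bucket_counts[bucket_of(a, b)] >= min_support
    }

    # Screening 2 (minimum support): count the surviving candidates exactly.
    pair_counts = Counter(
        pair for t in extended for pair in combinations(t, 2) if pair in candidates
    )
    return {p: c for p, c in pair_counts.items() if c >= min_support}


# Toy data, assumed for illustration only.
taxonomy = {
    "jacket": "outerwear", "ski pants": "outerwear", "outerwear": "clothes",
    "shirt": "clothes", "shoes": "footwear", "boots": "footwear",
}
transactions = [
    ["shirt"], ["jacket", "boots"], ["ski pants", "boots"],
    ["shoes"], ["shoes"], ["jacket"],
]
result = frequent_pairs(transactions, taxonomy, min_support=2)
print(result)
```

Because the hash screening only discards pairs whose bucket count is already below the minimum support, the final result does not depend on the choice of hash function; a better hash simply leaves fewer redundant candidates for the exact counting pass.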
Keywords/Search Tags:Data Mining, Multi-level Association Rule Algorithm, Cumulate Algorithm, Decision Tree Algorithm, ID3 Algorithm