Research On Improving Naive Bayes Classifiers And Its Application

Posted on: 2017-01-05   Degree: Master   Type: Thesis
Country: China   Candidate: K X Yu   Full Text: PDF
GTID: 2309330485470036   Subject: Statistics
Abstract/Summary:
Classification is one of the most basic and important abilities in human social activity, and classification algorithms have become a core topic in data mining. The Naive Bayesian Classifier is known for its well-developed theoretical foundation, clear and simple structure, good adaptability, and high classification accuracy, but under some conditions it does not perform well in practical applications. To address this problem and improve the algorithm's performance, this thesis starts from the Naive Bayesian Classifier, studies the various Attribute Weighted Naive Bayesian Classifiers, and takes into account the data types encountered in practice. It proposes two algorithms: a Weighted Naive Bayesian Classifier based on the Tau-y correlation coefficient and a Weighted Naive Bayesian Classifier based on the Kendall τ correlation coefficient. The specific research work is as follows:
(1) After studying the Naive Bayesian Classifier and its various improved versions, the thesis proposes an improved algorithm, the Weighted Naive Bayesian Classifier based on the Tau-y correlation coefficient, in which the Tau-y coefficient determines the attribute weights. Experiments verify its classification efficiency: the improved algorithm raises the efficiency of classification and shows good classification accuracy, particularly on small-sample data sets.
(2) To extract the useful information in the data and to solve some specific problems that arise in practice, the thesis introduces the Kendall τ correlation coefficient, which comes from nonparametric statistics, and proposes a new algorithm, the Weighted Naive Bayesian Classifier based on the Kendall τ correlation coefficient. The experimental results show that the algorithm achieves higher classification accuracy and performs better especially on data sets that contain more class variables.
(3) The new algorithm is applied to the problem of classifying individual bank customers, which further verifies its performance.
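The attribute-weighting idea summarized above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the weighting formula, so the sketch assumes each attribute's log-likelihood is multiplied by its Goodman-Kruskal tau (taken here as the "Tau-y" coefficient) with respect to the class label, and it assumes categorical attributes with Laplace smoothing. The names `tau_y`, `WeightedNaiveBayes`, and `weight_fn` are hypothetical.

```python
from collections import Counter
import math


def tau_y(xs, ys):
    """Goodman-Kruskal tau: proportional reduction in the error of
    predicting ys when xs is known (0 = no help, 1 = perfect)."""
    n = len(ys)
    y_counts = Counter(ys)
    x_counts = Counter(xs)
    xy_counts = Counter(zip(xs, ys))
    # Expected errors when guessing y from its marginal distribution.
    e1 = n - sum(c * c for c in y_counts.values()) / n
    if e1 == 0:
        return 0.0  # only one class present, nothing to explain
    # Expected errors when guessing y within each category of x.
    e2 = n - sum(c * c / x_counts[x] for (x, _), c in xy_counts.items())
    return (e1 - e2) / e1


class WeightedNaiveBayes:
    """Categorical naive Bayes scoring log P(c) + sum_i w_i * log P(x_i | c).
    Assumption: w_i = weight_fn(attribute i, class label)."""

    def __init__(self, weight_fn=tau_y):
        self.weight_fn = weight_fn

    def fit(self, rows, labels):
        n_attrs = len(rows[0])
        self.n = len(labels)
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        columns = [[r[i] for r in rows] for i in range(n_attrs)]
        self.weights = [self.weight_fn(col, labels) for col in columns]
        self.n_values = [len(set(col)) for col in columns]
        self.cond = Counter()  # (attribute index, value, class) -> count
        for row, y in zip(rows, labels):
            for i, v in enumerate(row):
                self.cond[(i, v, y)] += 1
        return self

    def predict(self, row):
        def score(c):
            s = math.log(self.class_counts[c] / self.n)
            for i, v in enumerate(row):
                # Laplace-smoothed estimate of P(x_i = v | c).
                p = (self.cond[(i, v, c)] + 1) / (self.class_counts[c] + self.n_values[i])
                s += self.weights[i] * math.log(p)
            return s
        return max(self.classes, key=score)


# Toy data: attribute 0 determines the class, attribute 1 is unrelated to it.
rows = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
labels = ["p", "p", "q", "q"]
model = WeightedNaiveBayes().fit(rows, labels)
print(model.weights)
print(model.predict(("a", "y")), model.predict(("b", "x")))
```

On this toy data the informative attribute receives the full weight and the unrelated one is ignored. Because `weight_fn` is a parameter, a Kendall τ based weight for ordinal attributes could be substituted; since Kendall τ can be negative, that variant would presumably need its absolute value or some other non-negative transform, which is likewise an assumption and not something the abstract states.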
Keywords/Search Tags: Classification algorithm, Naive Bayesian Classifier, Weights, Correlation coefficient