
Constructing Ensembles of Bayes-Based Classifiers Using PCA and AdaBoost

Posted on: 2011-05-19
Degree: Master
Type: Thesis
Country: China
Candidate: S F Chen
Full Text: PDF
GTID: 2178330332458212
Subject: Computer software and theory
Abstract/Summary:
Classification, an important branch of data mining, offers many mature techniques that are widely applied in medicine, finance, commerce, telecommunications, scientific research, and other fields. A key issue in classification is how to improve classification accuracy. Ensemble (combination) methods, which grew out of machine learning as an effective way to enhance the performance of weak classifiers, are regarded as one of the most effective learning approaches proposed in the last decade and remain a central research topic in machine learning and pattern recognition.

This thesis focuses on Bayes classifiers, which have exhibited high accuracy and speed when applied to large datasets, and analyses the characteristics of two ensemble methods, Rotation Forest and AdaBoost. Drawing on the strengths of both, we propose a novel method for constructing ensembles of Bayes-based classifiers, called PCABoostBayes.

To create a training set, the method randomly splits the feature set into K subsets and applies principal component analysis (PCA) to each subset to obtain its principal components. All the principal components are then assembled into a new feature space, into which the whole original dataset is mapped to produce a new training set. Repeating this process yields different feature spaces and therefore different training sets. On each new training set we build a group of classifiers, boosted one by one with AdaBoost, so several different classifier groups are generated in several different feature spaces.
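The feature-space construction described above (random K-way feature split, PCA per subset, components assembled into one rotation of the original space) can be sketched in plain NumPy. This is an illustrative reading of the abstract, not the thesis code; all function and variable names are hypothetical.

```python
import numpy as np

def build_rotated_training_set(X, K, rng):
    """Split the feature set into K random subsets, run PCA on each
    subset, and stack the resulting principal components into one
    block-diagonal rotation matrix (up to a column permutation) that
    maps the full dataset into a new feature space.
    Hypothetical sketch; names are not taken from the thesis."""
    n_samples, n_features = X.shape
    perm = rng.permutation(n_features)          # random feature split
    subsets = np.array_split(perm, K)

    rotation = np.zeros((n_features, n_features))
    for cols in subsets:
        sub = X[:, cols]
        centered = sub - sub.mean(axis=0)
        # PCA on the subset via SVD; rows of Vt are principal axes
        _, _, Vt = np.linalg.svd(centered, full_matrices=False)
        # place this subset's principal components into the block of
        # the rotation matrix corresponding to its feature columns
        rotation[np.ix_(cols, cols)] = Vt.T
    # map the whole original dataset into the new feature space
    return X @ rotation, rotation
```

Calling this repeatedly with a fresh random split yields the several distinct training sets on which the per-space AdaBoost groups are trained.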
In the classification phase, we first obtain a prediction from each classifier group by a weighted vote among its members, and then take a majority vote over the groups' predictions to produce the ensemble's final result. Experiments on 30 benchmark datasets chosen at random from the UCI Machine Learning Repository show that the proposed method not only improves the performance of Bayes-based classifiers significantly, but also achieves higher accuracy on most datasets than other ensemble methods such as Rotation Forest and AdaBoost.
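The two-stage combination in the classification phase can be sketched as follows: a weighted vote (using the AdaBoost member weights) inside each group, then a plain majority vote across the groups. This is a minimal sketch under my reading of the abstract; the names and data layout are assumptions, not the thesis implementation.

```python
import numpy as np

def two_stage_vote(group_preds, group_weights, n_classes):
    """group_preds[g][m] is the label array predicted by member m of
    group g on the test samples; group_weights[g][m] is that member's
    AdaBoost weight. Returns the ensemble's final label per sample.
    Hypothetical sketch; names are not taken from the thesis."""
    group_labels = []
    for preds, weights in zip(group_preds, group_weights):
        scores = np.zeros((len(preds[0]), n_classes))
        for p, w in zip(preds, weights):
            # weighted vote inside the group: add the member's weight
            # to the score of the class it predicts for each sample
            scores[np.arange(len(p)), p] += w
        group_labels.append(scores.argmax(axis=1))
    # plain majority vote across the groups' predictions
    stacked = np.stack(group_labels)
    counts = np.apply_along_axis(np.bincount, 0, stacked,
                                 minlength=n_classes)
    return counts.argmax(axis=0)
```

For example, three groups of two members each, on two test samples: the group with weights (2.0, 1.0) lets its stronger member win the internal vote, and the across-group majority then decides each sample's final label.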
Keywords/Search Tags: Data mining, Classifier ensemble, PCA, AdaBoost, Bayes