Ensemble Learning Method And Its Application In Telecom Churn Prediction

Posted on: 2017-01-13  Degree: Master  Type: Thesis
Country: China  Candidate: C Y Huang  Full Text: PDF
GTID: 2348330503485513  Subject: Probability theory and mathematical statistics
Abstract/Summary:
Classifier ensembles are an important research direction in machine learning and pattern recognition. Because an ensemble of classifiers can often outperform a single classifier, the approach has attracted wide attention. Previous studies indicate that three issues strongly affect ensemble performance: (1) the accuracy of the component classifiers; (2) the diversity of the classifiers; (3) the strategy used to combine the available classifiers.

Building on a study of two existing algorithms, we propose an improved classifier ensemble algorithm, DELMBR. In this algorithm, the training data for each base classifier are built by taking a bootstrap sample of the original training set and then manipulating an arbitrary set of attributes of each pattern. Given the resulting series of classifiers, the weight of each base classifier is determined by a naive elastic net criterion. The improved algorithm has three advantages. First, regarding the accuracy of the component classifiers, parameters are chosen by cross-validation. Second, regarding diversity, building each training set from a bootstrap sample with manipulated attributes yields base classifiers that differ substantially from one another. Third, regarding the combination strategy, determining the weights by a naive elastic net criterion leads to sparse learning of the combination: it removes base classifiers that contribute little to the ensemble and shrinks the weights of base classifiers toward zero, which makes the ensemble more robust.

Finally, for the telecom churn prediction problem, we build churn prediction models based on the Bagging algorithm, the AdaBoost algorithm and DELMBR, respectively.
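
The three ingredients described above (bootstrap sampling, attribute manipulation, and elastic-net weighting) can be illustrated with a minimal sketch. This is not the thesis's DELMBR implementation: the abstract does not specify the base learner, the form of attribute manipulation, or the solver. The sketch assumes decision stumps as base classifiers, assumes manipulation means overwriting randomly chosen attributes with values taken from other rows, and assumes the naive elastic net weights are fitted by coordinate descent on a least-squares loss. All function names and default parameters are hypothetical.

```python
import random

def bootstrap_with_perturbation(X, y, n_attrs, rng):
    """Bootstrap sample, then overwrite n_attrs random attributes per row.
    Assumed manipulation: the value is copied from a random original row."""
    n, d = len(X), len(X[0])
    idx = [rng.randrange(n) for _ in range(n)]
    Xb = [list(X[i]) for i in idx]
    yb = [y[i] for i in idx]
    for row in Xb:
        for j in rng.sample(range(d), n_attrs):
            row[j] = X[rng.randrange(n)][j]
    return Xb, yb

def fit_stump(X, y):
    """Exhaustive decision stump (feature, threshold, sign); labels in {-1, +1}."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for s in (1, -1):
                err = sum(1 for row, yi in zip(X, y)
                          if (s if row[j] > t else -s) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, s)
    return best[1:]

def stump_predict(stump, row):
    j, t, s = stump
    return s if row[j] > t else -s

def soft_threshold(z, g):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    return (z - g) if z > g else (z + g) if z < -g else 0.0

def naive_elastic_net(H, y, lam1, lam2, n_iter=200):
    """Coordinate descent for
    min_w (1/2n)||y - Hw||^2 + lam1*||w||_1 + (lam2/2)*||w||_2^2."""
    n, m = len(H), len(H[0])
    w = [0.0] * m
    resid = [float(v) for v in y]  # equals y - Hw while w is all zeros
    for _ in range(n_iter):
        for k in range(m):
            col = [H[i][k] for i in range(n)]
            # correlation of column k with the residual that excludes w[k]
            rho = sum(col[i] * (resid[i] + col[i] * w[k]) for i in range(n)) / n
            norm = sum(c * c for c in col) / n
            new = soft_threshold(rho, lam1) / (norm + lam2)
            if new != w[k]:
                for i in range(n):
                    resid[i] -= col[i] * (new - w[k])
                w[k] = new
    return w

def train_ensemble(X, y, n_base=15, n_attrs=1, lam1=0.05, lam2=0.01, seed=0):
    """Train stumps on perturbed bootstrap samples, then weight them."""
    rng = random.Random(seed)
    stumps = [fit_stump(*bootstrap_with_perturbation(X, y, n_attrs, rng))
              for _ in range(n_base)]
    H = [[stump_predict(s, row) for s in stumps] for row in X]
    return stumps, naive_elastic_net(H, y, lam1, lam2)

def predict(stumps, w, row):
    """Sign of the weighted vote; zero-weight stumps drop out of the ensemble."""
    score = sum(wk * stump_predict(s, row) for s, wk in zip(stumps, w))
    return 1 if score >= 0 else -1
```

In this sketch the L1 term (`lam1`) sets the weights of weakly contributing stumps to exactly zero, while the L2 term (`lam2`) shrinks the remaining weights. In practice both penalties would be tuned by cross-validation, as the abstract states for the algorithm's parameters.
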
Experimental results show that the model based on DELMBR is better suited to this real-world application. The cross-validation results illustrate this. Compared with the Bagging algorithm, DELMBR increases precision, recall and accuracy by 3.21%, 0.39% and 0.81%, respectively; that is, the DELMBR-based model has almost the same recall and accuracy as Bagging, while its precision is significantly higher. Compared with the AdaBoost algorithm, the precision and accuracy of DELMBR are 0.89% and 0.33% lower, respectively, while its recall is 2.34% higher. For customer churn prediction, however, the main concern is effectively identifying the customers who will be lost, which is measured by recall. The customer churn prediction model based on DELMBR is therefore more in line with practical demand.
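
As a reminder of how the three reported metrics relate to one another, the following sketch computes them from predicted and true labels. The function name and the toy labels are hypothetical and are not taken from the thesis's data; churners are assumed to be the positive class (label 1).

```python
def churn_metrics(y_true, y_pred, positive=1):
    """Return (precision, recall, accuracy) for the positive (churn) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0  # flagged customers who churned
    recall = tp / (tp + fn) if tp + fn else 0.0     # churners that were caught
    accuracy = (tp + tn) / len(pairs)
    return precision, recall, accuracy

# Hypothetical toy labels: 3 churners among 8 customers.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
precision, recall, accuracy = churn_metrics(y_true, y_pred)
```

Recall counts a missed churner (a false negative) against the model, which is why it matches the goal of identifying customers who are about to leave.
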
Keywords/Search Tags: classifier ensemble, diversity of classifiers, sparsity learning, naive elastic net criterion, least squares