Analyzing And Predicting Customer's Churn In Telecommunications Industry Using Data Mining
Posted on: 2004-04-09 | Degree: Master | Type: Thesis | Country: China | Candidate: P Wang | GTID: 2168360122960256 | Subject: Computer application technology
Abstract/Summary:
Frequent customer loss is a serious problem in the mobile telecommunications market, and it is expected to worsen as foreign telecom companies enter the market. To combat the high cost of churn, the thesis proposes a feasible solution: first, build a churn prediction model using data mining technology; then, use the model to analyze why customers churn and which customers are most likely to churn in the future; finally, design better-targeted campaigns, based on a summary of customers' calling behavior and interests, to increase retention. The thesis discusses how to build the model in four stages: business question definition, data preparation, model building, and model optimization and evaluation.
The first stage defines the questions the model will answer and the goals it pursues. The second stage addresses how to select the dataset, reduce noise, normalize values and, in particular, select attributes. Three techniques are used to decrease the number of attributes: deleting attributes irrelevant to the task using Fisher's Discriminant Ratio; merging correlated attributes according to Pearson's Correlation Coefficient; and reducing the dimensionality of the attribute vector by Singular Value Decomposition.
The third stage, model building, involves customer classification and churn prediction. The purpose of customer classification is to obtain clusters of customers with common calling behavior; the prediction model is then built on these clusters. A modified k-means method that greatly reduces computational complexity is proposed to cluster similar customers.
Churn prediction uses decision tree algorithms.
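The first two attribute-reduction techniques named in the data preparation stage can be illustrated with a minimal sketch. The abstract does not give the formulas used in the thesis, so this assumes the common two-class form of Fisher's Discriminant Ratio, (mean_0 - mean_1)^2 / (var_0 + var_1), and the standard Pearson coefficient. The attribute names and values are hypothetical toy data, not data from the thesis.

```python
from math import sqrt
from statistics import mean, pvariance


def fisher_ratio(values, labels):
    """Two-class Fisher's Discriminant Ratio for a single attribute.

    Assumed form: (mean_0 - mean_1)^2 / (var_0 + var_1).
    A value near 0 suggests the attribute does not separate the classes.
    """
    stay = [v for v, y in zip(values, labels) if y == 0]
    churn = [v for v, y in zip(values, labels) if y == 1]
    denom = pvariance(stay) + pvariance(churn)
    if denom == 0:
        # Both classes are constant: perfectly separable or identical.
        return float("inf") if mean(stay) != mean(churn) else 0.0
    return (mean(stay) - mean(churn)) ** 2 / denom


def pearson(x, y):
    """Pearson's correlation coefficient between two attributes."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


# Hypothetical toy data: label 1 means the customer churned.
labels = [0, 0, 0, 1, 1, 1]
monthly_minutes = [300, 320, 310, 100, 120, 110]
monthly_calls = [30, 32, 31, 10, 12, 11]
account_digit = [5, 1, 9, 5, 9, 1]

relevant = fisher_ratio(monthly_minutes, labels)   # large: keep
irrelevant = fisher_ratio(account_digit, labels)   # zero: candidate to delete
redundancy = pearson(monthly_minutes, monthly_calls)  # near 1: candidate to merge
```

An attribute with a ratio near zero is a candidate for deletion, and a pair of attributes with a correlation near 1 or -1 is a candidate for merging. Any thresholds for these decisions would have to come from the thesis itself.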
After a brief overview of the tree-building and tree-pruning algorithms of traditional decision trees, the thesis describes in detail how to push constraints into the tree-building and tree-pruning phases. By computing the cost of the cheapest subtree of the partial tree that satisfies the size constraint (an upper bound on the cost of the final optimal tree) and lower bounds on the cost of subtrees of varying sizes rooted at nodes of the partial tree, the algorithms can identify and prune nodes that cannot possibly belong to the optimal constrained subtree. The method that pushes size constraints into the tree-building phase is applied in the prediction system. When splitting tree nodes, the Gini index is used as the splitting criterion, and the CAIM measure is used to transform continuous attributes into discrete ones.
To improve accuracy, boosting, a voting classification method, is applied. Finally, the experimental results are explained.
Keywords/Search Tags: Customer loss, decision trees, clustering, size constraints, attribute extraction, correlation analysis, CAIM algorithm, boosting
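The Gini index splitting criterion mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes attributes are already discrete (as they would be after a discretization step such as CAIM) and uses a multiway split on each distinct value, which is an assumption. The function names, attribute names and data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a set of class labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(values, labels):
    """Weighted Gini impurity after partitioning by each distinct value."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    return sum(len(group) / n * gini(group) for group in groups.values())


def best_attribute(table, labels):
    """Name of the attribute whose split yields the lowest weighted impurity."""
    return min(table, key=lambda name: split_gini(table[name], labels))


# Hypothetical toy data: 1 means the customer churned.
churned = [1, 1, 1, 0, 0, 0]
table = {
    "plan": ["prepaid", "prepaid", "prepaid",
             "contract", "contract", "contract"],
    "region": ["north", "south", "north", "south", "north", "south"],
}

chosen = best_attribute(table, churned)
```

On this toy data the split on `plan` separates the two classes completely, so its weighted impurity is zero and it is selected over `region`.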