Analyzing And Predicting Customer's Churn In Telecommunications Industry Using Data Mining

Posted on:2004-04-09

Degree:Master

Type:Thesis

Country:China

Candidate:P Wang

Full Text:PDF

GTID:2168360122960256

Subject:Computer application technology

Abstract/Summary:

Frequent customer loss (churn) is a serious problem in the mobile telecommunications market, and it is expected to worsen as foreign telecom companies enter. To combat the high cost of churn, the thesis proposes a feasible solution: first, build a churn prediction model using data mining techniques; then, use the model to analyze why customers churn and which customers are most likely to churn in the future; finally, run better-targeted campaigns, informed by a summary of customers' calling behavior and interests, to increase retention. The thesis discusses how to build the model in four stages: business question definition, data preparation, model building, and model optimization and evaluation.

The first stage explains the questions the model will answer and the goals it pursues. The second stage addresses how to select the dataset, reduce noise, normalize values and, above all, select attributes. Three techniques are used to reduce the number of attributes: deleting attributes irrelevant to the task using Fisher's Discriminant Ratio; merging correlated attributes according to Pearson's correlation coefficient; and reducing the dimensionality of the attribute vector by Singular Value Decomposition.

The third stage, model building, involves customer classification and churn prediction. The purpose of customer classification is to obtain clusters of customers with common calling behavior; the prediction model is then built on these clusters. A modified k-means method that greatly reduces computational complexity is proposed to cluster similar customers.

Churn prediction uses decision tree algorithms. After a brief overview of the tree-building and tree-pruning algorithms of traditional decision trees, the thesis describes in detail how to push constraints into the tree-building and tree-pruning phases.
By computing the cost of the cheapest subtree of the partial tree that satisfies the size constraints (an upper bound on the cost of the final optimal tree) and lower bounds on the cost of subtrees of varying sizes rooted at nodes of the partial tree, the algorithms can identify and prune nodes that cannot possibly belong to the optimal constrained subtree. The method that pushes size constraints into the tree-building phase is applied in the prediction system. When splitting tree nodes, the Gini index is used as the splitting criterion and the CAIM measure is used to discretize continuous attributes. To improve accuracy, a boosting method is used to combine classifiers by voting. Finally, the experimental results are explained.
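
The Gini-index splitting criterion mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names (`gini`, `split_gini`, `best_threshold`) are hypothetical, and the candidate thresholds are simply the observed attribute values, whereas the thesis first discretizes continuous attributes with CAIM.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of a set of class labels: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(values: Sequence[float], labels: Sequence[int],
               threshold: float) -> float:
    """Size-weighted Gini impurity of the two children produced by
    splitting on `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_threshold(values: Sequence[float],
                   labels: Sequence[int]) -> Optional[Tuple[float, float]]:
    """Return (threshold, weighted Gini) of the best binary split,
    or None when the attribute has a single distinct value.

    Assumption for illustration only: candidates are the observed
    values (the largest is skipped because it leaves the right child
    empty). The thesis uses CAIM-based discretization instead.
    """
    candidates = sorted(set(values))[:-1]
    if not candidates:
        return None
    scored = ((t, split_gini(values, labels, t)) for t in candidates)
    return min(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Toy data (made up): monthly call minutes and a churn flag.
    minutes = [1, 2, 3, 4]
    churned = [0, 0, 1, 1]
    print(best_threshold(minutes, churned))
```

A lower weighted Gini value means purer child nodes, so a tree builder following this criterion would pick the split with the smallest value at each node.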

Keywords/Search Tags:

Customer loss, decision trees, clustering, size constraints, attribute extraction, correlation analysis, CAIM algorithm, boosting

Related items

1	Research And Application On The Retail Bank’s Customer Accurate Classification Model Based On Hadoop
2	Design And Implementation For A Bank Customer Loss Warning System
3	Research On Semi-supervised Size Constrained Clustering
4	Aviation Customer Value Assessment And Churn Prediction Model Based On Data Mining Analysis
5	Application Of Clustering Algorithm Based On Attribute Weighting In Bank Customer Segmentation
6	Customer Churned Analysis Based On Decision Trees Algorithm
7	Research On Clustering Algorithm For Mixed Attributes And Application
8	Face Attribute Recognition Based On Tree Structure
9	Research On Fair Privacy Gradient Boosting Decision Tree System Based On Trusted Execution Environment
10	Research And Application Of Data Mining Technology In Telecommunication Customer Loss