The Construction And Application Of The Warning Model Of Telecom Customer Churn

Posted on: 2016-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: P Duan
GTID: 2309330482465725
Subject: Applied statistics
Abstract/Summary:
In recent years, with the rapid development of the telecommunications industry, the communications market has gradually become saturated and a "three pillars" situation has formed: China Mobile, China Telecom, and China Unicom are shifting the focus of competition from acquiring new users to retaining existing users. Most previous studies did not consider the effect of imbalanced data on prediction, so this paper builds on earlier work to study telecom customer churn warning based on imbalanced data. The data come from a company's real records. To handle the imbalanced data, this paper proposes a hybrid sampling technique that combines an improved SMOTE algorithm based on density clustering with subsampling based on clustering algorithms. The core idea of the improved SMOTE algorithm is to cluster the minority-class samples by density in order to identify noise samples; SMOTE oversampling is then carried out on the basis of the noise samples identified by that clustering.
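
The density-based noise step followed by SMOTE can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It assumes a DBSCAN-style noise rule (a point is noise if it is neither a core point nor within `eps` of one) and assumes that noise points are excluded before interpolation; the abstract does not spell out either detail. The function names and the parameters `eps`, `min_pts`, and `k` are hypothetical.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_noise(points, eps, min_pts):
    """Return indices of DBSCAN-style noise points.

    A point is a core point if at least min_pts points (itself included)
    lie within eps of it. Noise is neither core nor within eps of a core point.
    """
    neighbours = [
        [j for j, q in enumerate(points) if euclidean(p, q) <= eps]
        for p in points
    ]
    core = {i for i, nb in enumerate(neighbours) if len(nb) >= min_pts}
    return {
        i for i, nb in enumerate(neighbours)
        if i not in core and not any(j in core for j in nb)
    }


def smote(points, n_new, k, rng):
    """Create n_new synthetic points by interpolating between a randomly
    chosen point and one of its k nearest neighbours."""
    if len(points) < 2:
        raise ValueError("need at least two points to interpolate")
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(points))
        base = points[i]
        others = sorted(
            (p for j, p in enumerate(points) if j != i),
            key=lambda p: euclidean(base, p),
        )
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def density_filtered_smote(minority, n_new, eps=1.0, min_pts=3, k=3, seed=0):
    """Drop noise from the minority class (an assumption), then run SMOTE."""
    noise = find_noise(minority, eps, min_pts)
    clean = [p for i, p in enumerate(minority) if i not in noise]
    return clean, smote(clean, n_new, k, random.Random(seed))


# Toy minority class: four close points and one isolated outlier.
minority = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (5.0, 5.0)]
clean, synthetic = density_filtered_smote(minority, n_new=6, eps=0.5, min_pts=3, k=2)
print(clean)      # the outlier (5.0, 5.0) is flagged as noise and dropped
print(synthetic)  # six interpolated points near the dense group
```

Filtering before interpolation keeps synthetic samples from being generated along segments that run toward isolated points, which is the usual motivation for pairing a density criterion with SMOTE.
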
However, even after oversampling with the improved SMOTE, the numbers of positive and negative samples still differ considerably. The negative samples are therefore subsampled using clustering algorithms: this paper applies hierarchical clustering with the sum-of-squared-deviations (Ward) method, K-Means clustering, and PAM clustering, and then draws a fixed proportion of samples from each cluster label by stratified sampling, so that the data become balanced.
With the final balanced data, models are built with the decision tree method and the random forest method and evaluated by precision, recall, F-value, and AUC. The random forest, an ensemble model based on decision trees, performs best.
Once churn has been predicted, customers must be maintained and retained in a timely manner; only then can customer retention truly be achieved. This paper segments potential churn customers by value, risk, and stability to obtain several customer groups and their characteristics. These groups are then matched against the enterprise's main products and resources to arrive at a precise marketing strategy for retaining each group of users.
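
The cluster-based subsampling of the negative samples can be sketched in the same spirit. This is an illustration under assumptions, not the thesis's code: it uses a plain K-Means (Lloyd's algorithm) only, leaves out the hierarchical and PAM variants, and keeps the same fraction of every cluster. The names `kmeans`, `cluster_undersample`, and the parameters `k` and `fraction` are hypothetical.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, rng, n_iter=20):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centre for every point.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster empties out
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def cluster_undersample(majority, k, fraction, seed=0):
    """Stratified subsampling: keep about `fraction` of each cluster,
    and at least one point from every non-empty cluster."""
    rng = random.Random(seed)
    labels = kmeans(majority, k, rng)
    kept = []
    for c in range(k):
        members = [p for p, lab in zip(majority, labels) if lab == c]
        if members:
            n_keep = max(1, round(fraction * len(members)))
            kept.extend(rng.sample(members, n_keep))
    return kept


# Toy majority class: two separated groups of ten points each.
majority = [(i * 0.1, (i % 3) * 0.1) for i in range(10)]
majority += [(10 + i * 0.1, 10 + (i % 3) * 0.1) for i in range(10)]
kept = cluster_undersample(majority, k=2, fraction=0.5)
print(len(majority), len(kept))
```

Sampling per cluster rather than from the whole majority class is intended to preserve the shape of the majority distribution while shrinking it, so that small but distinct customer groups are not lost by chance.
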
Keywords/Search Tags: imbalanced data, density clustering, SMOTE, subsampling, decision trees, random forest