The Study Of The Hierarchical Prediction Of Zhejiang Mobile Corporation Customer Churn Based On Stacking Ensemble Learning

Posted on: 2019-02-21
Degree: Master
Type: Thesis
Country: China
Candidate: B B Wang
Full Text: PDF
GTID: 2439330575950446
Subject: Applied statistics
Abstract/Summary:
With the continued development of China's telecommunications industry, the maturing of 4G services has brought a sharp increase in network speed, and customers increasingly depend on mobile data traffic (referred to here as flow). Flow has become the largest source of profit for mobile operators, accounting for more than 50% of the total. Differences in flow usage reflect differences in customer behavior. It is therefore important to build a flow-based hierarchical model of customers, analyze customer needs, improve service quality, raise customer loyalty and satisfaction, and prevent customer churn.

Based on real customer data from Zhejiang Mobile Company, this paper divides customers into low-flow customers and medium-high-flow customers according to their actual flow consumption. Within the research time window, a customer is marked as a positive (churned) sample when four conditions are met: ARPU is greater than 0 yuan in each of three consecutive months (November 2017 to January 2018); time on the network is longer than 3 months at the end of January 2018; MOU is greater than 0 in each of the same three consecutive months; and MOU is equal to 0 in March 2018. Analyzing the churn characteristics of the two groups in terms of basic attributes, communication characteristics, and flow characteristics shows that churn among low-flow customers is significantly related to calling behavior, while churn among medium-high-flow customers is significantly related to flow behavior.

Based on business experience, this paper constructs a pre-selected index system for the layered prediction of customer churn at Zhejiang Mobile Company. Using a five-month time window, it selects 121 indicators in 11 categories: basic attributes, accounting behavior, call behavior, flow behavior, SMS behavior, product stickiness, complaint behavior, contact circle behavior, contract binding, terminal behavior, and two-network behavior. Because the indicators are high-dimensional, the paper first removes some variables through data cleaning and correlation tests, then removes further variables based on IV (information value), and constructs separate index systems for low-flow and medium-high-flow customers. A comparison shows that the index system for medium-high-flow customers retains more flow indicators, which supports the rationale for stratifying by flow. Finally, an autoencoder is used to reduce dimensionality further and to extract features for the two groups separately.

Using the indicators that finally enter the model, separate models are built for each customer layer. The paper compares three classifiers, logistic regression, random forest, and XGBoost, and analyzes the advantages and disadvantages of each. To improve performance further, it uses a stacking ensemble learning algorithm to combine the base classifiers and compares model performance before and after combination. First, models are built with the three base classifiers. The outputs of the three base classifiers are then combined as the input of a second-level classifier, which uses logistic regression to learn the final combined classifier. The comparison shows that the combined classifier achieves higher accuracy and AUC than each base classifier, with accuracy close to 90% and an AUC of 93%. Finally, the paper summarizes its main work and, in light of its shortcomings, outlines directions for further research.
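The four labeling conditions in the abstract can be read as a simple rule applied to each customer. The sketch below is one possible reading, not the thesis's actual code: the `CustomerWindow` record and all of its field names are hypothetical, and it assumes "more than 0" and "longer than 3 months" are strict inequalities.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class CustomerWindow:
    """Hypothetical per-customer record; the field names are illustrative only."""
    arpu_nov_to_jan: Sequence[float]  # monthly ARPU in yuan, Nov 2017 to Jan 2018
    mou_nov_to_jan: Sequence[float]   # monthly MOU, Nov 2017 to Jan 2018
    months_on_network: float          # time on the network at the end of Jan 2018
    mou_march: float                  # MOU in Mar 2018


def is_churn_positive(c: CustomerWindow) -> bool:
    """Return True when all four conditions from the abstract hold."""
    # Condition 1: ARPU above 0 yuan in each of the three observation months.
    active_arpu = len(c.arpu_nov_to_jan) == 3 and all(v > 0 for v in c.arpu_nov_to_jan)
    # Condition 2: on the network for longer than 3 months (assumed strict).
    long_enough = c.months_on_network > 3
    # Condition 3: MOU above 0 in each of the same three months.
    active_mou = len(c.mou_nov_to_jan) == 3 and all(v > 0 for v in c.mou_nov_to_jan)
    # Condition 4: no call minutes at all in March 2018.
    silent_in_march = c.mou_march == 0
    return active_arpu and long_enough and active_mou and silent_in_march
```

Read this way, a customer who was active through January 2018 but has zero MOU in March 2018 is labeled positive; customers who fail any of the first three conditions are not labeled positive by this rule.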
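The abstract screens variables by IV value but does not give the formula or a cutoff. Assuming IV stands for information value in its usual form, the sum over bins of (share of churners minus share of non-churners) times the log of their ratio, a minimal sketch looks like the following. The function name, the pre-binned input, and the `eps` smoothing for empty bins are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def information_value(bins: Sequence[Hashable], churned: Sequence[int], eps: float = 0.5) -> float:
    """Information value of one already-binned feature against a 0/1 churn label."""
    if len(bins) != len(churned):
        raise ValueError("bins and churned must have the same length")
    pos = Counter(b for b, y in zip(bins, churned) if y == 1)
    neg = Counter(b for b, y in zip(bins, churned) if y == 0)
    labels = set(pos) | set(neg)
    # Additive smoothing (an assumption) keeps the log finite when a bin
    # contains no churners or no non-churners.
    total_pos = sum(pos.values()) + eps * len(labels)
    total_neg = sum(neg.values()) + eps * len(labels)
    iv = 0.0
    for b in labels:
        p = (pos[b] + eps) / total_pos  # share of churners falling in bin b
        q = (neg[b] + eps) / total_neg  # share of non-churners falling in bin b
        iv += (p - q) * math.log(p / q)
    return iv
```

Each term of the sum is non-negative, so a feature whose bins split churners and non-churners in the same proportions scores near zero, and a more discriminating feature scores higher. Variables scoring below a chosen threshold would be the ones dropped; the threshold used in the thesis is not stated in the abstract.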
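The two-level stacking setup described in the abstract can be sketched with scikit-learn's `StackingClassifier`. This is an illustration under several assumptions, not the thesis's implementation: the data is synthetic (`make_classification`), `GradientBoostingClassifier` stands in for XGBoost so the example needs only scikit-learn, and the second-level model is trained on cross-validated base-model probabilities, a detail the abstract does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for one customer layer (not the real data).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# First level: three base classifiers, mirroring the abstract's choice.
base_learners = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),  # XGBoost stand-in
]

# Second level: logistic regression fitted on the base models' predicted probabilities.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    stack_method="predict_proba",
    cv=5,
)
stack.fit(X_train, y_train)

churn_probability = stack.predict_proba(X_test)[:, 1]
accuracy = accuracy_score(y_test, stack.predict(X_test))
auc = roc_auc_score(y_test, churn_probability)
print(f"accuracy={accuracy:.3f}  AUC={auc:.3f}")
```

Fitting each base learner alone on the same split and comparing its accuracy and AUC with the stacked model would reproduce the before-and-after comparison the abstract describes. The numbers from this synthetic example say nothing about the results reported in the thesis.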
Keywords/Search Tags:Customer Churn, Flow, Layering, Autoencoder, Stacking Ensemble Learning