Research On Customer Churn Forecast And Influencing Factors Of Communication Operators

Posted on: 2021-01-31
Degree: Master
Type: Thesis
Country: China
Candidate: T Xin
Full Text: PDF
GTID: 2439330611462870
Subject: Applied statistics
Abstract/Summary:
With the development of Internet technology and the growing saturation of the communications market, predicting customer churn and retaining at-risk customers in advance has become key to the industry's development. China has only recently entered the 5G era, and operators face fierce competition, product homogenization and other problems that lead to serious customer churn. It is therefore important to identify customers who are likely to churn, find the influencing factors in the large volume of customer information and consumption data, build a relatively complete churn analysis framework, and give targeted retention suggestions, so that operators can serve customers more precisely and generate more revenue.

Chapter 1 introduces the research background and significance, taking one operator in the communications industry as the case. It reviews the industry's development trend in terms of operators' business income, service usage and user scale, explains why the research matters for operator revenue as emerging technologies and the Internet develop, and surveys domestic and international research methods for customer churn together with the development of machine learning methods.

Chapter 2 is an exploratory analysis of the operator's customer data. It first introduces the data sources, then describes the preprocessing, which covers the handling of missing values and the transformation of variables, and finally visualizes the data to show its basic characteristics.

Chapter 3 screens 16 important independent variables using box plots, the Spearman correlation coefficient, hypothesis tests, the decision-tree-based ID3 algorithm and the SVM-L1 algorithm. The variables are: package monthly fee, telephone number level, 4G non-online billing, 4G online billing, contract period, non-contract period, number of months online, billing time, caller billing time, Internet traffic, 3G TV fee, fee 2, Internet access fee, due fee, contract period and giveaway phone fee.

Chapter 4 builds the customer churn prediction model using Boosting algorithms and survival analysis. First, churn prediction models are built with AdaBoost, GBDT, XGBoost, LightGBM and CatBoost, and the CatBoost model is selected as the best predictor. Then, to understand when customers churn and which risk factors are involved, a survival analysis model is built. The Kaplan-Meier method is used to plot the cumulative survival function for the discrete variables, which shows directly how each of them affects the customer survival rate. A Cox proportional hazards regression model is fitted on all variables, and the risk factors affecting churn are analyzed; they include package charges, some telecom types, receivables and gift charges. With all factors considered together, the survival probability falls to its lowest level, about 0.4, once a customer has been with the operator for more than 200 months, and stays there. Finally, the CatBoost algorithm is improved on the basis of the survival analysis model: the survival probability predicted by the survival model is incorporated into the samples, which are then predicted by CatBoost. The accuracy, precision, recall and F1 score of the improved model are all about 0.96. The gain in predictive performance over the plain CatBoost model is not significant, but the improved model also provides the risk factors affecting churn and each customer's survival probability, which helps explain when and why customers churn in more detail.

Chapter 5 clusters the variables that predict churn using hierarchical clustering and finds that the factors affecting churn are Internet traffic, online time, receivable expense and other composite factors. Combined with the preceding analysis, corresponding suggestions are given to the operator for retaining customers and developing its business.

Chapter 6 gives the conclusions and outlook of the research. It first summarizes the full text and states the conclusions of the thesis. It then explains three shortcomings of the research, in data collection, algorithm improvement and customer value mining, and looks ahead to future work.
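
The cumulative survival curves described for chapter 4 are typically obtained with the Kaplan-Meier product-limit estimator. The following is a minimal sketch of that estimator, not the thesis's implementation: the function name `kaplan_meier`, the input layout and the toy tenure values are all assumptions made for illustration. At each observed churn time t the running survival probability is multiplied by (1 - d/n), where d is the number of customers who churn at t and n is the number still at risk just before t.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Kaplan-Meier survival estimate.

    durations: observed tenure of each customer (e.g. months online)
    events:    1 if the customer churned at that time, 0 if censored
               (still active when observation ended)
    Returns a list of (time, survival_probability) pairs, one per
    distinct time at which at least one churn was observed.
    """
    churned = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)  # churned or censored at each time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = churned.get(t, 0)
        if d:
            # Product-limit step: fraction of the risk set surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone observed at t leaves the risk set afterwards.
        at_risk -= leaving[t]
    return curve


# Hypothetical toy data: tenures in months and churn indicators.
toy_durations = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
for month, prob in kaplan_meier(toy_durations, toy_events):
    print(month, round(prob, 3))
```

To compare groups defined by a discrete variable, as the abstract describes, the same function would be called once per group and the resulting curves plotted together.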
Keywords/Search Tags:Customer churn, Machine learning, Boosting algorithm, Survival analysis