
Prediction Of Bank Credit Card Customer Churn Based On Logistic Regression And XGBoost

Posted on: 2022-01-31
Degree: Master
Type: Thesis
Country: China
Candidate: M D Zhang
Full Text: PDF
GTID: 2480306311475904
Subject: Applied Statistics
Abstract/Summary:
After more than ten years of rapid development, competition in the bank credit card market has become fierce. Many banks blindly pursue growth in card issuance and neglect the maintenance of existing customers. As a result, the number of "dormant cards" keeps rising and customer loss occurs from time to time, which may become a hidden danger to the stable development of the banking industry. Changing the business philosophy and improving the loyalty of credit card customers is an effective direction for the development of the whole industry. It is therefore of great significance for the healthy development of the banking industry to build a model that predicts credit card customer churn and identifies customers with a tendency to churn in time, so that customer resources can be better maintained.

Many kinds of models have emerged in the field of credit card customer churn prediction. However, most experts and scholars either train the prediction model with a single algorithm or let several trained single models predict churn by voting; few study the problem from the perspective of a combination model.

This paper is based on the bank credit card customer data released on the Kaggle platform in November 2010. During data preprocessing, unknown values of categorical variables are treated as a separate category, and each categorical variable is encoded. A simple data visualization is then carried out to study the relationship between some variables and customer churn.

Two algorithms, Logistic regression and XGBoost, are used to build single models. This paper first summarizes the theory of the two algorithms, then constructs a Logistic regression model and an XGBoost model on the same training data to predict credit card customer churn, and uses both models to predict the test set. Comparing the performance of the two models on the test set shows that the XGBoost model identifies churned credit card customers better.

The Stacking ensemble learning method is then used to integrate the Logistic regression model and the XGBoost model, and two combination models with different structures are constructed. In the first combination model, the first-layer base classifiers are the XGBoost model and the Logistic regression model, and the second-layer meta-classifier is a Logistic regression model. In the second combination model, the first-layer base classifier is the XGBoost model alone, and the second-layer meta-classifier is a Logistic regression model. The results show that the second combination model is simpler in structure and performs better on the test set.

This paper finds that, when building a model to predict credit card customer churn, the combination models constructed with the Stacking ensemble learning method perform better on the test set than the two single models. Of the two combination models, the one with the simpler structure performs better on the test set, which shows that a more complex combination model is not necessarily better. Other kinds of single models and other model combinations can also be chosen for customer churn prediction, which offers a new research idea in this field.
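The two-layer Stacking structure described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes synthetic two-feature data in place of the Kaggle dataset, and a one-split decision stump as a toy stand-in for XGBoost so that the example needs only the Python standard library. All function names (`fit_logistic`, `fit_stump`, `out_of_fold`, `fit_stacking`) are hypothetical. The first layer produces out-of-fold churn probabilities, and a Logistic regression meta-classifier is trained on those probabilities.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=200):
    """Logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def predict_logistic(model, X):
    w, b = model
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

def fit_stump(X, y):
    """Toy stand-in for XGBoost (assumption): one split, leaf value = mean label."""
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(x[j] for x in X)):
            left = [yi for xi, yi in zip(X, y) if xi[j] <= t]
            right = [yi for xi, yi in zip(X, y) if xi[j] > t]
            if not left or not right:
                continue
            pl, pr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - pl) ** 2 for v in left) + sum((v - pr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, pl, pr)
    return best[1:]

def predict_stump(model, X):
    j, t, pl, pr = model
    return [pl if x[j] <= t else pr for x in X]

def out_of_fold(fit, predict, X, y, k=5):
    """Score every training row with a model that never saw that row."""
    oof = [0.0] * len(X)
    for fold in range(k):
        tr = [i for i in range(len(X)) if i % k != fold]
        te = [i for i in range(len(X)) if i % k == fold]
        model = fit([X[i] for i in tr], [y[i] for i in tr])
        for i, p in zip(te, predict(model, [X[i] for i in te])):
            oof[i] = p
    return oof

def fit_stacking(bases, X, y):
    # Layer 1: out-of-fold probabilities of each base model are the meta-features.
    meta_X = [list(r) for r in zip(*[out_of_fold(f, p, X, y) for f, p in bases])]
    meta = fit_logistic(meta_X, y)  # Layer 2: Logistic regression meta-classifier
    fitted = [(f(X, y), p) for f, p in bases]  # refit bases on all training rows
    return fitted, meta

def predict_stacking(model, X):
    fitted, meta = model
    meta_X = [list(r) for r in zip(*[p(m, X) for m, p in fitted])]
    return predict_logistic(meta, meta_X)

def accuracy(probs, y):
    return sum((p >= 0.5) == (yi == 1) for p, yi in zip(probs, y)) / len(y)

# Synthetic data (assumption): label is 1 when the two features sum to more than 0.
rng = random.Random(0)
X_all = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(240)]
y_all = [1 if a + b > 0 else 0 for a, b in X_all]
X_train, y_train = X_all[:160], y_all[:160]
X_test, y_test = X_all[160:], y_all[160:]

# First combination: tree stand-in + Logistic regression as base classifiers.
both = [(fit_stump, predict_stump), (fit_logistic, predict_logistic)]
probs_both = predict_stacking(fit_stacking(both, X_train, y_train), X_test)
acc_both = accuracy(probs_both, y_test)

# Second combination: tree stand-in alone as the base classifier.
single = [(fit_stump, predict_stump)]
probs_single = predict_stacking(fit_stacking(single, X_train, y_train), X_test)
acc_single = accuracy(probs_single, y_test)

print(f"both bases: {acc_both:.3f}, single base: {acc_single:.3f}")
```

The accuracies printed here reflect only the synthetic data and the toy stump, so they say nothing about the results reported in the thesis. With real data, the stump would be replaced by an actual gradient-boosted tree model.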
Keywords/Search Tags:Bank credit card, Churn prediction, Logistic regression, XGBoost, Stacking