Analysis Of Credit Card User Default Prediction Based On Over-sampling Method

Posted on: 2020-06-18; Degree: Master; Type: Thesis
Country: China; Candidate: J Wang; Full Text: PDF
GTID: 2417330596470674; Subject: Statistics
Abstract/Summary:
With the rapid spread of credit card business around the world, credit risk is also growing rapidly. Widespread default by credit card users has caused heavy losses for banks and other financial institutions. It is therefore important to assess credit risk and identify likely defaulters in advance. Doing so gives banks and other financial institutions a basis for decision-making, helping them set reasonable lending policies, reduce their own risk and develop soundly. In general, the class distribution in credit card data sets is extremely imbalanced: non-defaulters are relatively numerous and defaulters are relatively few, so traditional manual credit risk assessment models are no longer applicable. This paper uses data mining and machine learning methods to explore and analyze a credit card data set from two perspectives, data and model. The data set comes from the Kaggle website. It contains the historical consumption and default records of credit card users of a foreign bank from 2015 to 2017, and its class distribution is extremely imbalanced. First, the SMOTE and ADASYN algorithms are each used to over-sample the data set so that the classes in the processed data are relatively balanced. The advantage of this approach is that no information from the majority class is lost. Then, based on the over-sampled data, prediction models including Logistic Regression, Random Forest, Neural Network and XGBoost are built. Comparing the evaluation metrics of the models identifies the optimal prediction model, the one that identifies defaulting users to the greatest extent. Finally, analysis of the results of each model reveals the main factors that affect the default behavior of card users.
Keywords/Search Tags:SMOTE algorithm, ADASYN algorithm, Classification model, Model evaluation, Credit card default forecast
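
The over-sampling step described in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is not the thesis's code. The function name `smote`, its parameters, and the toy two-feature data are all assumptions made for illustration, and the sketch uses only the Python standard library.

```python
import random
from math import dist


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; needs at least two minority points.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy data (assumed): 8 non-defaulters, 3 defaulters, two features each.
majority = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3),
            (0.1, 0.4), (0.4, 0.2), (0.2, 0.5), (0.5, 0.1)]
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3)]

# Generate enough synthetic defaulters to match the majority class size.
synthetic = smote(minority, len(majority) - len(minority), k=2)
balanced_minority = minority + synthetic
print(len(majority), len(balanced_minority))
```

Because every synthetic point is built only from existing minority samples, the majority class is left untouched, which is the advantage the abstract attributes to over-sampling. ADASYN follows the same interpolation idea but generates more synthetic samples around minority points that have many majority-class neighbours. In practice, a maintained implementation such as the `SMOTE` and `ADASYN` classes in the imbalanced-learn package would normally be used instead of a hand-written helper.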