Research And Application Of Stacked Credit Scoring Model Based On Petty Loan

Posted on: 2019-06-23  Degree: Master  Type: Thesis
Country: China  Candidate: Y Shen  Full Text: PDF
GTID: 2359330548953993  Subject: Applied statistics
Abstract/Summary:
In recent years, with the emergence of professional credit institutions and services, credit risk management has become a hot issue in the financial field, and petty loans also face opportunities for innovation and development. The scientific and effective evaluation of personal credit status is of great practical significance for reducing credit risk and for establishing and improving the credit market. In credit risk management, credit scoring is the most important quantitative method for measuring the size of risk, and it is an important basis for credit decisions. The credit scoring model, as the core of credit scoring, has therefore been developed and put into operation by commercial banks and credit institutions at home and abroad, and has become an important research topic in the credit industry. On this issue, this paper uses statistical learning methods to explore Lending Club's loan data with a three-year financing cycle from 2007 to 2015.

First, we take the Lending Club (LC) loan data, which has 74 variables and 620 thousand loan records, and initially select 200 thousand records. Their loan statuses are 'Fully Paid', 'Late (16-30 days)', 'Does not meet the credit policy. Status: Fully Paid', 'Charged Off', 'Late (31-120 days)' and 'Does not meet the credit policy. Status: Charged Off'. We drop a variable if more than 80% of its values are missing or if a single value accounts for more than 95% of its records. We convert text fields to numeric values, apply dummy coding to categorical variables, and convert date fields. After these operations, the cleaned data set has 27 variables.

Second, we discretize the continuous variables using a binning method based on weight of evidence (WOE). Discretization makes the model more stable and easier to iterate, and because it simplifies the model, it reduces the risk of overfitting. Using the information value (IV) of each variable, we select 13 important variables, such as total repayment and home_ownership, and then set up the model's indicator system.

Third, after comparing machine learning models such as the decision tree, the gradient boosted decision tree and the back-propagation neural network, we select three single models: the traditional scorecard model, eXtreme Gradient Boosting (XGBoost) and a deep neural network. Each is trained and produces scores on its own. On this basis, we combine the traditional method with the two newer models, so that each contributes its own strengths, and build a stacked model. Through step-by-step tuning, the model is further optimized: the parameter estimates are more accurate, the credit scoring error is smaller, and the accuracy is 2% higher than that of the single models.

Finally, we use the results of the stacked model to predict credit scores. We build a scoring model from the default probability produced by the stacked model and obtain the corresponding credit score for each customer. By comparing the credit scores with the actual defaults, we verify the reliability of the results. By analyzing and comparing credit scores and default probabilities, we give corresponding suggestions to borrowers and to lenders.

At present, XGBoost and deep neural networks are not widely used in the credit scoring field. Using petty loan data, this paper establishes credit scoring models with logistic regression and these two methods, and builds a stacked model on top of them to improve on the current method.
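The WOE and IV step in the abstract can be illustrated with a minimal sketch. It is not the thesis's code: the function name `woe_iv`, the bin labels and the counts are hypothetical, and it assumes the common convention WOE = ln(share of good loans in the bin / share of bad loans in the bin), with IV as the sum over bins of the difference in shares times WOE. Some references use the reverse ratio, which flips the sign of WOE but leaves IV unchanged.

```python
import math


def woe_iv(bins):
    """Compute WOE per bin and the total IV for one binned variable.

    bins: list of (label, n_good, n_bad) tuples, one per bin.
    Every bin must contain at least one good and one bad loan,
    otherwise the logarithm is undefined (no smoothing is applied here).
    """
    total_good = sum(good for _, good, _ in bins)
    total_bad = sum(bad for _, _, bad in bins)
    woe_by_bin = {}
    iv = 0.0
    for label, good, bad in bins:
        if good <= 0 or bad <= 0:
            raise ValueError(f"bin {label!r} needs both good and bad loans")
        share_good = good / total_good  # fraction of all good loans in this bin
        share_bad = bad / total_bad     # fraction of all bad loans in this bin
        woe = math.log(share_good / share_bad)
        woe_by_bin[label] = woe
        iv += (share_good - share_bad) * woe
    return woe_by_bin, iv


# Hypothetical counts for a variable split into two bins.
example_bins = [("low", 80, 20), ("high", 20, 80)]
example_woe, example_iv = woe_iv(example_bins)
```

Variables would then be ranked by IV and the strongest ones kept. The abstract does not state the cut-off used to arrive at 13 variables, so any threshold applied to `example_iv` would be a further assumption.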
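The abstract says that credit scores are derived from the stacked model's default probability but does not give the mapping. One widely used option, shown here only as an assumed illustration, is log-odds scaling with a "points to double the odds" (PDO) parameter. The function name `probability_to_score` and the defaults (600 points at good:bad odds of 50:1, 20 points per doubling) are hypothetical and are not taken from the thesis.

```python
import math


def probability_to_score(p_default, base_score=600.0, base_odds=50.0, pdo=20.0):
    """Map a default probability to a credit score by log-odds scaling.

    base_score, base_odds and pdo are illustrative assumptions:
    a borrower with good:bad odds equal to base_odds receives base_score,
    and each doubling of the odds adds pdo points.
    """
    if not 0.0 < p_default < 1.0:
        raise ValueError("p_default must be strictly between 0 and 1")
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    odds_good = (1.0 - p_default) / p_default  # odds of repaying
    return offset + factor * math.log(odds_good)


# A lower default probability yields a higher score.
score_safe = probability_to_score(1 / 101)   # odds of 100:1
score_base = probability_to_score(1 / 51)    # odds of 50:1
score_risky = probability_to_score(0.5)      # odds of 1:1
```

With this scaling, scores rise as the predicted default probability falls, which allows the kind of comparison between credit scores and actual defaults that the abstract describes.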
Keywords/Search Tags: Credit Score, Logistic Regression, XGBoost, Neural Network, Stacked Model