Research On The Prediction Of Individual Credit Card Overdue

Posted on: 2020-10-01
Degree: Master
Type: Thesis
Country: China
Candidate: J Zeng
GTID: 2439330596493447
Subject: Applied statistics
Abstract/Summary:
With the rapid development of China's economy, more and more young people are consuming ahead of their income, and personal credit cards are used ever more frequently. This leads to overdue repayments and exposes commercial banks to credit risk. To control that risk, commercial banks need to identify customers who are likely to become overdue, reduce their credit lines, and in some cases suspend their cards. Using data on personal identity and property status, cardholder information, transactions, loans, repayments and loan applications, this paper applies statistical methods to the problem of predicting overdue personal credit cards. The work covers three aspects.

First, the data are class-imbalanced: only a small share of the samples are overdue customers, and most are non-overdue customers. Most domestic literature handles imbalanced data with random down-sampling, random over-sampling or SMOTE. After analyzing the advantages and disadvantages of these three methods, and in order to avoid their shortcomings, this paper applies down-sampling multiple times to generate several class-balanced data sets, which are then used for ensemble learning of the prediction models.

Second, the data set contains nearly 200 variables. To improve the classification performance of the prediction model and reduce the running time of the program, this paper studies variable selection. The main variable selection method in the domestic and foreign literature is principal component analysis, but its limitation is that the principal components are difficult to interpret. After discussing the advantages and disadvantages of traditional variable selection methods and the Relief algorithm, this paper adopts the Relief algorithm to select variables.

Third, in the selection of prediction models, the domestic and foreign literature mainly uses logistic regression and random forests for bank data analysis. This paper compares the performance of logistic regression, decision trees and support vector machines, and chooses logistic regression as the prediction model; it performs best when evaluated on the test set. Finally, the reasons for overdue repayment are analyzed.
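The repeated down-sampling idea mentioned in the abstract can be illustrated with a short sketch. This is not the thesis's implementation, which the abstract does not give. The function names `balanced_subsets` and `majority_vote`, the 0/1 label coding, and the rule that ties are resolved toward the overdue class are all assumptions made for illustration. Each subset keeps every minority-class sample and adds an equally sized random draw from the majority class; one model would then be trained per subset and the predictions combined.

```python
import random
from collections import Counter


def balanced_subsets(labels, n_subsets, seed=0):
    """Return n_subsets lists of row indices (hypothetical helper).

    Each list holds every minority-class index plus an equally sized
    random draw, without replacement, from the majority class.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    minority_label = min(counts, key=counts.get)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    subsets = []
    for _ in range(n_subsets):
        drawn = rng.sample(majority, len(minority))
        subsets.append(sorted(minority + drawn))
    return subsets


def majority_vote(predictions):
    """Combine 0/1 predictions; predictions[m][i] is model m's label for row i.

    Assumption: ties are resolved toward class 1 (overdue).
    """
    n_models = len(predictions)
    return [1 if 2 * sum(votes) >= n_models else 0 for votes in zip(*predictions)]
```

With labels such as `[1] * 3 + [0] * 12`, a call like `balanced_subsets(labels, n_subsets=4)` yields four index lists of six rows each. Unlike a single random down-sample, the different majority draws let the ensemble see more of the majority class overall.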
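For readers unfamiliar with Relief, the following is a minimal sketch of the basic two-class algorithm for numeric features. It is a textbook-style illustration and not the code used in the thesis. The function name `relief`, the range-normalized Manhattan distance, and sampling instances with replacement are assumptions. A feature's weight rises when it differs between a sampled instance and its nearest neighbor of the other class (nearest miss) and falls when it differs from the nearest neighbor of the same class (nearest hit).

```python
import random


def relief(X, y, n_iterations=None, seed=0):
    """Return one Relief weight per feature (hypothetical helper).

    X is a list of numeric feature rows, y a list of two-class labels.
    Assumes both classes are present and each has at least two rows.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])

    # Range of each feature, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        column = [row[j] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(j, a, b):
        return abs(a[j] - b[j]) / spans[j]

    def distance(a, b):
        return sum(diff(j, a, b) for j in range(d))

    m = n_iterations or n
    weights = [0.0] * d
    for _ in range(m):
        i = rng.randrange(n)  # sample an instance (with replacement)
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        hit = min(hits, key=lambda k: distance(X[i], X[k]))
        miss = min(misses, key=lambda k: distance(X[i], X[k]))
        for j in range(d):
            # Reward separation from the nearest miss, penalize
            # separation from the nearest hit.
            weights[j] += (diff(j, X[i], X[miss]) - diff(j, X[i], X[hit])) / m
    return weights
```

Variables would then be ranked by weight, and those above a chosen threshold kept. Because each weight belongs to an original variable, the result stays interpretable, which is the advantage over principal components that the abstract points to.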
Keywords/Search Tags: Overdue credit card, Prediction, Class imbalance, Relief, Logistic regression