Statistical Analysis Of Personal Credit Default Risk Prediction

Posted on: 2021-02-02  Degree: Master  Type: Thesis
Country: China  Candidate: Y Y Chen  Full Text: PDF
GTID: 2430330602998147  Subject: Applied Statistics
Abstract/Summary:
With the continuous development of China's economy, more and more people are accepting the spend-ahead consumption model of credit loans, and personal credit loans account for a rising share of banks' lending business. P2P (peer-to-peer) lenders and other Internet finance companies are also flourishing. Banks and Internet financial institutions therefore urgently need reasonable and effective measures to control the credit risk that comes with this continued expansion of business scale. This thesis predicts and classifies credit default risk using a real data set of users' historical transactions provided by Home Credit. First, the data are analyzed visually to understand their distribution and the variable types, followed by data cleaning and feature engineering. Then three models are used to predict a user's credit default risk: the logistic regression model, the random forest model (a Bagging ensemble algorithm), and the Light Gradient Boosting Machine (LightGBM) model, a leading Boosting ensemble algorithm. To address the problem of unbalanced data, the thesis applies the SMOTEENN combined sampling method, which pairs oversampling of the minority class with cleaning of the resampled data. Finally, the models are evaluated by recall, accuracy, and AUC. The results show that random forest and LightGBM both perform better than logistic regression, and that all three indicators of the LightGBM model are above 0.95.
Keywords/Search Tags: personal credit default risk, unbalanced data, logistic regression, random forest, LightGBM
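The three evaluation indicators named in the abstract (recall, accuracy, and AUC) can be illustrated with a minimal sketch in plain Python. This is not the thesis's code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made here for illustration only. The toy data are deliberately unbalanced (two defaulters among eight users) to show why accuracy alone can mislead.

```python
def recall(y_true, y_pred):
    """Share of actual defaulters (label 1) that the model flags as defaulters."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def accuracy(y_true, y_pred):
    """Share of all users whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auc(y_true, scores):
    """Probability that a random defaulter gets a higher score than a random
    non-defaulter (ties count as one half). Equivalent to the area under the
    ROC curve."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = default, 0 = no default (unbalanced, 2 of 8 default).
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.2, 0.7, 0.35]  # made-up predicted default probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed threshold of 0.5

# A trivial baseline that never predicts default.
y_never = [0] * len(y_true)

model_metrics = (recall(y_true, y_pred), accuracy(y_true, y_pred), auc(y_true, scores))
baseline_metrics = (recall(y_true, y_never), accuracy(y_true, y_never))
print(model_metrics, baseline_metrics)
```

On this toy data the "never default" baseline reaches the same accuracy as the thresholded scores while catching no defaulters at all, which is one reason recall and AUC are reported alongside accuracy when the classes are unbalanced.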