Research On Post-Loan Risk Warning Based On Unbalanced Three-Classification LGBM Model

Posted on: 2020-10-02
Degree: Master
Type: Thesis
Country: China
Candidate: X P Wang
GTID: 2439330596986787
Subject: Applied statistics
Abstract/Summary:
In recent years, big data and Internet finance have developed rapidly. As an important part of Internet finance, the P2P industry has the advantage of being more convenient than the traditional bank credit business. Applying data mining technology to prevent financial risks is an important topic at present. This paper takes post-loan risk warning as its research background and divides borrowers into three categories according to their repayment behavior: the implementer, the follower and the defaulter. After the data are cleaned with a feature selector, the XGBoost (XGB) and Random Forest (RF) algorithms are used to select features. Repayment schedule, loan cost, debt-paying ability and external credit extension are found to be the four important factors affecting the repayment outcome. The interest rate is the key factor in credit, and a regression model with dummy variables shows that the borrower's credit rating is the most significant factor affecting the interest rate. For the early warning model, cross-validation, learning curve analysis and statistical tests are used to compare 6 single models and 5 ensemble models based on decision trees. The ensemble models show clear advantages over the single models in classification performance, and the LightGBM (LGBM) model performs best. To reduce the impact of unbalanced data on the model results, the model is optimized in three respects: data perturbation, parameter perturbation and feature perturbation. Finally, the optimized model outperforms the other models under the two evaluation criteria of F1_macro score and Recall of defaulters. In particular, Recall of defaulters is significantly improved.
Keywords/Search Tags: post-loan risk warning, three-classification, feature selection, unbalanced data, LGBM
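The two evaluation criteria named in the abstract, F1_macro and Recall of defaulters, can be illustrated with a small pure-Python sketch. The toy labels and predictions below are invented for illustration and are not data from the thesis, and the helper names `per_class_scores` and `f1_macro` are hypothetical. The example shows why accuracy alone can mislead on unbalanced three-class data: a classifier that never predicts the rare class can still look accurate.

```python
from collections import Counter

# Class names follow the abstract; the samples are made up for illustration.
LABELS = ["implementer", "follower", "defaulter"]


def per_class_scores(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1)} computed one class at a time."""
    pairs = Counter(zip(y_true, y_pred))  # counts of (true, predicted) pairs
    scores = {}
    for c in labels:
        tp = pairs[(c, c)]
        fp = sum(n for (t, p), n in pairs.items() if p == c and t != c)
        fn = sum(n for (t, p), n in pairs.items() if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[c] = (precision, recall, f1)
    return scores


def f1_macro(y_true, y_pred, labels):
    """Unweighted mean of per-class F1, so a rare class counts as much as a common one."""
    scores = per_class_scores(y_true, y_pred, labels)
    return sum(f1 for _, _, f1 in scores.values()) / len(labels)


# Unbalanced toy sample: 6 implementers, 3 followers, 1 defaulter.
y_true = ["implementer"] * 6 + ["follower"] * 3 + ["defaulter"]
# A hypothetical classifier that never predicts the rare "defaulter" class.
y_pred = ["implementer"] * 6 + ["follower"] * 2 + ["implementer"] * 2

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
scores = per_class_scores(y_true, y_pred, LABELS)
defaulter_recall = scores["defaulter"][1]
macro = f1_macro(y_true, y_pred, LABELS)

print(f"accuracy={accuracy:.3f}  F1_macro={macro:.3f}  "
      f"defaulter recall={defaulter_recall:.3f}")
```

In this toy case the accuracy is 0.8 while the recall on defaulters is 0, and the macro average pulls the F1 score down accordingly. That is one plausible reading of why the abstract reports these two criteria rather than accuracy.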