
P2P Online Lending Personal Credit Assessment

Posted on: 2020-07-30  Degree: Master  Type: Thesis
Country: China  Candidate: C Yu  Full Text: PDF
GTID: 2439330578453314  Subject: Applied Statistics
Abstract/Summary:
With the rise of Internet finance, P2P online lending has successfully combined traditional finance with the Internet, and it has attracted the attention of a large number of investors and borrowers by virtue of its high yield, fast process and simple operation. P2P online lending platforms first appeared abroad and entered China in 2007. China's P2P online lending platforms are developing rapidly, but the lack of effective regulatory measures leads to more problematic platforms every year, which have an extremely adverse impact on investors, borrowers and the online lending market. Therefore, improving the risk identification and control ability of P2P online lending platforms and reducing the investment risk of investors is crucial for the healthy development of these platforms.

In this paper, traditional statistical models and machine learning models are used to evaluate the credit of borrowers. An empirical analysis is conducted on the data of 150,346 borrowers on the Lending Club platform in the United States, covering the whole year of 2017 and the first three quarters of 2018. First, this paper gives a statistical description of the borrowers' working years, real estate situation, credit rating and other information, obtains a basic portrait of the borrowers, and qualitatively analyzes their default risk. Then a quantitative analysis is carried out on the loan data to establish a default prediction model.

The data are unbalanced, and when the model is built without any processing, its recognition accuracy for default users is only 30%. To improve the recognition of default users, this paper uses Borderline-SMOTE, an improved version of the SMOTE algorithm, to process the unbalanced training data and obtain a relatively balanced training set. The model trained on the processed data improves the recognition accuracy for default users to 63%.

In this paper, the GBDT-Logistic regression model is first applied in the field of financial risk control and is compared with the Logistic regression model and the random forest model. The results show that the random forest has the best prediction effect, but its time cost is also the highest. Logistic regression has the lowest time complexity, which is a great advantage on massive data, but its prediction effect is the worst. The GBDT-Logistic regression model achieves a good balance between the two, with high prediction accuracy and moderate time cost.
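The oversampling step described above can be illustrated with a minimal sketch of the Borderline-SMOTE idea in plain Python. It is not the thesis's implementation: the function name `borderline_smote`, the toy data, the label convention (1 = default, 0 = non-default) and the choice `k=3` are all assumptions made for illustration. The sketch marks a minority sample as "in danger" when at least half, but not all, of its k nearest neighbours belong to the majority class, and then creates synthetic samples by interpolating between such a sample and one of its nearest minority neighbours.

```python
import math
import random


def borderline_smote(X, y, minority=1, k=5, n_new=None, seed=0):
    """Return synthetic minority points built around borderline samples.

    X is a list of equal-length numeric tuples, y the matching labels.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    if n_new is None:
        # By default, create enough points to equalise the two classes.
        n_new = max(len(X) - 2 * len(min_idx), 0)

    # Step 1: find "danger" minority samples, i.e. those whose k nearest
    # neighbours are at least half, but not entirely, majority samples.
    danger = []
    for i in min_idx:
        others = [j for j in range(len(X)) if j != i]
        nearest = sorted(others, key=lambda j: math.dist(X[i], X[j]))[:k]
        n_major = sum(1 for j in nearest if y[j] != minority)
        if len(nearest) / 2 <= n_major < len(nearest):
            danger.append(i)

    synthetic = []
    if not danger or len(min_idx) < 2:
        return synthetic

    # Step 2: interpolate between a danger sample and one of its
    # k nearest minority neighbours, at a random position on the segment.
    for _ in range(n_new):
        i = rng.choice(danger)
        peers = [j for j in min_idx if j != i]
        nearest = sorted(peers, key=lambda j: math.dist(X[i], X[j]))[:k]
        j = rng.choice(nearest)
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
    return synthetic


# Toy data: ten majority points on a line, four minority points beside them.
X = [(float(v), 0.0) for v in range(10)]
X += [(10.0, 0.0), (11.5, 0.0), (13.0, 0.0), (14.5, 0.0)]
y = [0] * 10 + [1] * 4

synthetic = borderline_smote(X, y, minority=1, k=3)
print(len(synthetic), "synthetic minority points generated")
```

In this toy set only the minority point at x = 10 sits on the class boundary, so every synthetic point lies on a segment starting from it. In practice, oversampling of this kind is applied to the training split only, so that the test data keep the original class proportions.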
Keywords/Search Tags: P2P online lending, Personal credit assessment, User portrait, Unbalanced data