
Personal Credit Risk Assessment for P2P Platforms Based on a Random Forest Model

Posted on: 2019-12-02
Degree: Master
Type: Thesis
Country: China
Candidate: C Yan
GTID: 2439330563996486
Subject: Applied Statistics
Abstract/Summary:
In recent years, P2P online lending platforms have sprung up in large numbers. As an innovative internet finance model, P2P lending offers several advantages: borrowing online is convenient and fast, returns on investment are high, and payback periods are short, which can effectively ease the financing difficulties of SMEs and individuals. However, many hidden concerns lie behind the rapid growth in the number of platforms. Investors' funds are exposed to numerous risks because the industry is young, personal credit rating technology lags behind, and a complete legal and regulatory system has not yet been established. The inability to accurately assess a borrower's credit risk has become an important technological bottleneck for the development of P2P platforms, so establishing a comprehensive credit risk assessment system is key to the sustainable development of each company. Against this background, this paper compares and analyzes the accuracy and stability of models for personal credit risk assessment, and then establishes a weighted random forest model that predicts whether an individual will default more accurately than the other models.

The first step is to acquire and process the data. We use Python to crawl data from Renrendai, a domestic P2P platform. The main feature variables include the borrower's basic information, the basic loan information, and so on. Inspection shows that some variables take only a single value or are of little use for training the model, so we select the more important feature variables using five-fold cross-validation.

The second step is to use the random forest (RF) model to classify individual credit into defaults and non-defaults. Compared with a traditional single-classifier model, an ensemble classifier such as the RF model is more stable, rarely overfits, and can also improve classification accuracy. In this step we introduce the basic principles of the RF model. Building on them, we propose a weighted random forest model and introduce cost-sensitive learning to improve the prediction accuracy on negative samples, so that the model is better suited to P2P data.

The final step is to classify and predict individual credit with the weighted random forest model and compare it with traditional credit risk assessment models. We reach the following conclusions. On the one hand, the weighted random forest model is more stable and has higher classification accuracy than the ordinary model. On the other hand, because negative samples are relatively scarce, we process the training data with the SMOTE method, increasing the number of negative-class samples so that more references are available during training, which improves the model's prediction accuracy on negative samples and its practicality.

At the end of the paper, we compare and analyze differences in the selection of P2P platform indicators in China and abroad. We find that foreign platforms focus more on the basic conditions of lenders, while domestic platforms are more concerned with basic loan information. Because this paper concentrates on setting different weights for different feature variables, the model is more suitable for domestic P2P platforms than for foreign ones.
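
The SMOTE step mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `smote`, its parameters, and the toy rows below are all assumptions made for illustration, and the scarce class is assumed to be defaulting borrowers. The sketch shows only the core idea of SMOTE, which is to create each synthetic sample at a random point on the line segment between a minority-class sample and one of its k nearest minority-class neighbours.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority-class neighbours.

    Illustrative helper (hypothetical name and signature), not a library API.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)          # fixed seed keeps the sketch reproducible
    k = min(k, len(minority) - 1)      # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority-class neighbours of the base point, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: squared_distance(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()             # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy rows of [loan_amount, interest_rate]; not data from the thesis.
defaults = [[1.0, 10.0], [1.5, 12.0], [2.0, 11.0], [2.5, 13.0]]
new_defaults = smote(defaults, n_synthetic=6, k=3)
```

Because every synthetic row is a convex combination of two real minority rows, it always stays within the range of the observed minority data. In practice the oversampling would be applied to the training split only, so that the evaluation data contain no synthetic samples.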
Keywords/Search Tags: Credit Assessment, Weighted Random Forest (WRF), Feature Comparison, SMOTE Method