| Peer-to-peer (P2P) lending has grown rapidly with the development of Internet finance, but its extremely high default rate has cost investors dearly. Existing risk control strategies suffer from inaccurate credit assessment and default identification, and from blind investment decisions. From the perspective of investors, this thesis studies how to select appropriate loan projects and how to allocate investment funds rationally. Using Lending Club data as the experimental object, risk control is carried out from the perspectives of binary classification and multi-class classification respectively. Firstly, a credit evaluation index system with non-redundant information and significant default identification ability is constructed in preparation for default identification in P2P lending. The first-level index layer of the system is established according to the 5C credit analysis method, and a two-stage index screening is carried out by combining the IV-WOE framework with the Spearman-Boruta algorithm. A total of 36 indicators are selected to form the final credit evaluation index system, and their correspondence with the 5C credit standard is given. This avoids subjective manual deletion of indicators and ensures that the selected indicators have strong default identification ability. Secondly, risk control of online lending investment is carried out from the perspective of binary classification, and a default identification model for online lending based on Stacking ensemble learning is proposed. The training and testing sets are divided according to the default sample ratio, nine single classifiers are trained, and the four best-performing ones are selected as base classifiers. With ROC-AUC as the performance evaluation standard, five-fold cross-validation is run 300 times with random search to tune the parameters. Finally, with ANN, RF, AdaBoost and XGBoost as the base classifiers and LR as the meta-classifier, the Stacking ensemble classification model is constructed to identify defaults in online lending. The comparison results show that the Stacking ensemble classifier combines the advantages of the single classifiers, has a stronger default identification ability, and can assist investors in selecting suitable lending projects, so it is a default identification model that better meets the needs of online lending business. Thirdly, risk control of online lending investment is carried out from the perspective of multi-class classification, and an improved portfolio model based on a multi-class expected rate of return matrix is proposed. A multi-class expected rate of return matrix is constructed that accounts for the loss caused by credit misclassification in lending. The difference in default probability is used to measure the proximity of loans, weights are determined by a Gaussian kernel function, and a weighted average over similar historical loans is used to quantify the expected return and risk rate of new loans. The online loan investment problem is then transformed into a quadratic programming problem, which yields the improved portfolio model. Six groups of experiments are designed on Lending Club data to verify the risk control effect of the improved portfolio model in online lending. The results show that, across different parameter settings, the optimized risk rate of the improved portfolio model is always lower than the original risk rate. The model not only guarantees the minimum rate of return but also helps investors allocate investment funds reasonably, which gives it practical value in the field of risk control for online lending investment. |
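The kernel-weighted estimate of a new loan's expected return and risk described in the abstract can be illustrated with a minimal sketch. This is only one plausible reading, not the thesis's actual formulation: the function names, the bandwidth value, the use of a weighted variance as the risk measure, and the sample figures are all assumptions introduced for illustration.

```python
import math


def gaussian_weights(p_new, p_hist, bandwidth):
    """Normalized Gaussian-kernel weights.

    Proximity between the new loan and each historical loan is measured by
    the difference in predicted default probability (assumed inputs).
    """
    raw = [math.exp(-((p_new - p) ** 2) / (2.0 * bandwidth ** 2)) for p in p_hist]
    total = sum(raw)
    return [w / total for w in raw]


def expected_return_and_risk(p_new, p_hist, r_hist, bandwidth=0.05):
    """Kernel-weighted mean and variance of historical returns.

    The weighted variance stands in for the "risk rate"; the thesis may
    define risk differently (assumption).
    """
    weights = gaussian_weights(p_new, p_hist, bandwidth)
    mean = sum(w * r for w, r in zip(weights, r_hist))
    variance = sum(w * (r - mean) ** 2 for w, r in zip(weights, r_hist))
    return mean, variance


# Hypothetical data: default probabilities and realized returns of three past loans.
p_hist = [0.05, 0.10, 0.30]
r_hist = [0.08, 0.06, -0.20]
p_new = 0.07  # predicted default probability of the new loan

weights = gaussian_weights(p_new, p_hist, bandwidth=0.05)
mu, risk = expected_return_and_risk(p_new, p_hist, r_hist, bandwidth=0.05)
print(weights, mu, risk)
```

Historical loans whose default probability is close to that of the new loan dominate the estimate, while distant loans contribute almost nothing. The per-loan estimates obtained this way would then feed the quadratic programming step, which is not shown here.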