The Personal Credit Evaluation Model Of Online Lending Based On Unbalance Samples

Posted on: 2019-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: L Wang
GTID: 2417330596463503
Subject: Applied Statistics
Abstract/Summary:
Online lending has driven the growth of Internet finance on the back of rapidly changing Internet technologies. Although the industry is developing very quickly, its problems are being exposed ever more frequently. Borrower default risk is high in online lending, and it has a negative effect on online lending transactions.

To study borrower default in online lending, this thesis uses Python to collect real transaction data from an Internet finance company and selects the observation indicators most closely related to a borrower's credit rating. A cross-distribution table of each indicator against credit rating is then built and used to score each indicator, which yields an online credit evaluation index system. The original data is quantified according to the resulting scoring table, and missing values are imputed with the EM algorithm.

With the index system in place, the thesis proposes an improved SMOTE algorithm to address the class imbalance in the data, where positive samples far outnumber negative samples. Both the original and the improved SMOTE algorithms are used to pre-process the initial unbalanced data set. Naive Bayes, neural network, K-nearest neighbor, support vector machine and decision tree classifiers are then applied to the SMOTE-processed data sets, and the geometric mean (G-mean) and the area under the curve (AUC) are used to evaluate classification performance. The comparison shows that the improved SMOTE algorithm gives better classification results, which indicates that the samples it generates are more reasonable. It also shows that the best classifier in combination with the improved SMOTE algorithm is the decision tree model.

Finally, the improved SMOTE algorithm is used to generate additional minority-class samples (borrowers with poor credit ratings) and so construct a new balanced data set. On this data set, the thesis first builds single decision tree models with pre-pruning and post-pruning based on the classical CART algorithm. It then takes the practical circumstances of online lending into account by constructing a loss matrix and building a C5.0 decision tree model based on loss function optimization. Because a single decision tree may be unstable, the thesis also builds an ensemble decision tree model based on the random forest algorithm to improve the accuracy and stability of the model's predictions.
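
As a rough illustration of two ideas named in the abstract, the sketch below shows classic SMOTE interpolation and the G-mean metric in plain Python. It is not the thesis's improved SMOTE algorithm, whose details the abstract does not give. The function names `smote` and `g_mean`, the parameters `n_new`, `k` and `seed`, and the toy data are assumptions made for this example only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Classic SMOTE sketch (hypothetical helper, not the thesis's improved variant).

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy minority class: four points at the corners of the unit square.
poor_credit = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
extra = smote(poor_credit, n_new=6, k=2, seed=1)
```

G-mean is a natural choice for unbalanced data because a classifier that assigns every sample to the majority class scores zero on it, however high its plain accuracy is.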
Keywords/Search Tags:online lending, SMOTE algorithm, decision tree, random forests