
Credit Rating For Online Lending Based On Cost-sensitive Classification And Ensemble Learning

Posted on: 2021-04-13 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: H M Wang | Full Text: PDF
GTID: 1360330626455762 | Subject: Management Science and Engineering
Abstract/Summary:
Credit rating is essential for financial institutions to reduce risk and increase returns, and it is one of the key technologies in the financial industry. With the advent of the big data era, data mining algorithms have become the most popular approach to credit rating. From the perspective of classification tasks, traditional credit rating is usually framed as a two-class problem in which loans are labeled either good or bad. This framing works for bank lending, because loan interest rates are relatively fixed and the bank's main objective is to decide whether to grant the loan. Online lending, however, is mostly unsecured and relies heavily on credit rating. To control credit risk, loans in online lending are usually divided into multiple credit grades according to the borrower's credit level and probability of default, so that a differentiated interest rate can be set for the loans in each credit grade. Credit rating in online lending therefore not only requires high accuracy from multi-class classification algorithms but also needs to account for misclassification costs and the ordinal relationship between classes. These characteristics pose problems and challenges for credit rating methods.

In response, this dissertation models multiple credit grades as multiple classes in a classification task. Based on cost-sensitive classification and ensemble learning, we propose credit rating methods that reflect the actual needs of online lending, so as to improve accuracy and reduce misclassification losses. Specifically, the research contents of this dissertation are as follows.

First, in online lending, misclassification between different credit grades causes losses of varying degrees. Reflecting these misclassification costs is one of the key issues in cost-sensitive classification. This dissertation proposes a method for measuring misclassification costs for credit rating in online lending. The lenders' loss of return and opportunity cost caused by misclassifying a loan's credit grade are treated as misclassification costs and assembled into a cost matrix, which is combined with cost-sensitive multi-class classification algorithms to reflect those costs. In addition, the dissertation determines the parameter ranges through a parameter analysis that considers both the properties of the model and the actual business background. A sensitivity analysis verifies that the measurement method is robust.

Second, another important task is to evaluate cost-sensitive classification algorithms and choose the one most suitable for credit rating in online lending. The measured cost matrix is used to construct cost-sensitive MetaCost algorithms and to calculate an evaluation criterion, total cost, for assessing their performance. Through experiments, we evaluate and compare the performance of MetaCost algorithms with ten underlying classifiers on ten different feature spaces, in order to select the best way to achieve cost-sensitive classification for credit rating in online lending.

Third, ensemble learning is one approach to improving classification accuracy. Traditional ensemble learning methods typically ignore the ordinal relationship between classes, which is very important for credit rating in online lending. Credit grades reflect the creditworthiness of borrowers, and the misclassification cost between two neighboring grades is smaller than that between two distant grades. This dissertation proposes an ordinal ensemble learning method based on pairwise comparison. The class labels predicted by the base classifiers are converted into a pairwise comparison matrix between the samples. A priority vector is then derived from the pairwise comparison matrix and used to sort and classify the samples, which improves accuracy.

Fourth, because real-world online lending data is much larger in scale than traditional pairwise comparison problems, existing prioritization methods are inefficient. This dissertation proposes a Bipartite Graph Iterative Method (BGIM) to derive the priority vector efficiently from large-scale pairwise comparison matrices, and it theoretically proves the method's convergence and error bound. Numerical examples and simulation experiments show that the proposed method not only derives a reliable priority vector but also improves computational efficiency.

Real loan data collected from an online lending platform, Lending Club, is used to demonstrate the proposed methods. The experimental results show that the proposed methods, i.e., the cost-sensitive multi-class classification method and the ordinal ensemble learning method based on a pairwise comparison matrix, provide effective ways to perform credit rating in online lending. Specifically, we calculate a misclassification cost matrix between the seven credit grades, find the subset of features with the best separability, and show that the MetaCost algorithm with a back-propagation (BP) neural network as the underlying classifier performs best. Furthermore, the proposed ordinal ensemble learning method, with MetaCost BP neural networks as base classifiers, achieves 76.10% accuracy and an average misclassification cost of 0.0029, which are better than the results of a single classifier and of two popular ensemble learning methods, i.e., Bagging and AdaBoost.

In summary, from the perspective of cost-sensitive multi-class classification algorithms and ensemble learning, this dissertation provides methods and tools for credit rating in online lending, which can help online lending platforms improve their profitability and risk management.
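
As a rough illustration of the cost-sensitive idea summarized in the abstract, the sketch below shows the minimum-expected-cost decision rule that MetaCost applies when relabeling training examples, together with a simple average misclassification cost. The three-grade cost matrix, the probability vector, the function names, and the definition of average cost as the mean of per-sample cost-matrix entries are all assumptions made for illustration; they are not the dissertation's seven-grade matrix, its exact criterion, or its results.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def expected_costs(probs: Sequence[float], cost: Matrix) -> List[float]:
    """Expected cost of predicting each class j: sum_i P(i|x) * cost[i][j]."""
    n = len(probs)
    return [sum(probs[i] * cost[i][j] for i in range(n)) for j in range(n)]


def min_cost_class(probs: Sequence[float], cost: Matrix) -> int:
    """Class with the lowest expected cost (not necessarily the most probable)."""
    risks = expected_costs(probs, cost)
    return min(range(len(risks)), key=risks.__getitem__)


def average_cost(y_true: Sequence[int], y_pred: Sequence[int], cost: Matrix) -> float:
    """Mean cost-matrix entry over all (true, predicted) pairs (assumed definition)."""
    return sum(cost[t][p] for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical cost matrix for three grades (0 = best, 2 = worst).
# COST[i][j] is the cost of predicting grade j when the true grade is i.
# Neighboring grades cost less than distant ones, and rating a bad loan
# as good (row 2, column 0) is made the most expensive error.
COST = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
    [4.0, 2.0, 0.0],
]

# Hypothetical class-probability estimate for one loan.
probs = [0.5, 0.1, 0.4]

most_probable = max(range(len(probs)), key=probs.__getitem__)
cheapest = min_cost_class(probs, COST)
print("most probable grade:", most_probable)
print("minimum-expected-cost grade:", cheapest)
print("average cost:", average_cost([0, 1, 2, 2], [0, 2, 2, 0], COST))
```

In this toy case the most probable grade and the minimum-expected-cost grade differ, which is the behavior a cost matrix is meant to produce when some errors are far more expensive than others.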
Keywords/Search Tags: credit rating, data mining, cost-sensitive classification, ensemble learning, pairwise comparison matrix