Application Of Sparse Logistic Regression In Bank Credit Business

Posted on: 2021-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: L Wang
Full Text: PDF
GTID: 2480306248455844
Subject: Applied Statistics
Abstract/Summary:
In recent years, personal consumer credit has grown rapidly, but this growth carries substantial risk. Establishing a sound personal credit evaluation index system, building a more reasonable and effective classification model, improving the accuracy with which potential defaulters are predicted, and minimizing losses have therefore become core problems for banks and other financial institutions. This thesis first introduces the theory of sparse learning in detail, including variable selection methods for sparse models and optimization algorithms for the objective function. The variable selection methods fall into three groups: methods based on penalty functions, the Dantzig selector and its derivatives, and sure independence screening (SIS) methods for variable selection in ultra-high-dimensional data. The optimization algorithms covered are the Bregman algorithm, least angle regression, and coordinate descent. The thesis then introduces the basic theory of the logistic regression model commonly used in personal credit scoring and, on this basis, builds three models to evaluate personal credit: Lasso-Logistic regression, MCP-Logistic regression, and SCAD-Logistic regression. Numerical experiments on simulated data illustrate the advantage of the non-convex regularization terms MCP and SCAD. Finally, real personal credit data are used for an empirical analysis in which the three models are compared by prediction accuracy. The experimental results show that the logistic-regression-based models predict credit default customers with good accuracy and robustness. In this example, the prediction accuracy of Lasso-Logistic regression and MCP-Logistic regression is about 74%, and that of SCAD-Logistic regression is 75.5%. Analysis of the regression coefficients shows that, among the 21 explanatory variables obtained after feature construction, account status, loan installment time, credit history, loan amount, payment amount to income ratio, loan purpose, marital status, nature of work, and whether the customer is a foreign worker have a significant impact on the model. In other words, these variables have the greatest influence on whether a personal credit loan defaults, and banks and major financial institutions should focus on them.
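For orientation, the three penalties compared in the abstract can be sketched in plain Python from their standard textbook definitions. The abstract does not give the thesis's own formulas or tuning values, so the function names and the defaults `a = 3.7` and `gamma = 3.0` are assumptions made for illustration only.

```python
def lasso_penalty(beta, lam):
    """L1 (Lasso) penalty: grows linearly in |beta| without bound."""
    return lam * abs(beta)


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty (assumes a > 2): L1 near zero, constant beyond a * lam."""
    b = abs(beta)
    if b <= lam:
        return lam * b
    if b <= a * lam:
        # Quadratic piece joining the linear and constant regions.
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


def mcp_penalty(beta, lam, gamma=3.0):
    """MCP penalty (assumes gamma > 1): constant beyond gamma * lam."""
    b = abs(beta)
    if b <= gamma * lam:
        return lam * b - b * b / (2 * gamma)
    return gamma * lam * lam / 2


if __name__ == "__main__":
    # Compare the three penalties for small, medium, and large coefficients.
    for beta in (0.5, 2.0, 10.0):
        print(beta,
              lasso_penalty(beta, 1.0),
              scad_penalty(beta, 1.0),
              mcp_penalty(beta, 1.0))
```

For a large coefficient, the Lasso penalty keeps growing while SCAD and MCP level off. This is the usual argument that the two non-convex penalties shrink large, genuinely relevant coefficients less than Lasso does.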
Keywords/Search Tags: sparse logistic regression, personal credit score, MCP and SCAD regularization, risk control