
Research On Credit Risk Control Based On Smoothed Regularized Logistic Regression

Posted on: 2021-12-28 | Degree: Master | Type: Thesis
Country: China | Candidate: C N Chen | Full Text: PDF
GTID: 2510306302976219 | Subject: Management Science and Engineering
Abstract/Summary:
In recent years, rising consumer demand and a growing acceptance of consuming in advance of income have created large demand for small, dispersed loans that the traditional credit business of banks cannot cover. Internet credit has spread rapidly because of its convenience, but unlike traditional credit, which is backed by mortgages and guarantees, it approves loans solely on the basis of a credit investigation of the borrower. Because the Internet credit market has developed without standardization and the relevant laws are lacking, Internet credit carries very high risk, and risk control has become a key problem for credit institutions.

Credit institutions lend only to approved applications (accepted samples), so they can observe default labels only for those samples. Accepted samples, however, are just a subset of all applications. If the model is retrained at each iteration on accepted samples alone, it overfits to them, which undermines the prediction accuracy of the credit risk control model on future applications. In both theoretical research and practice, reject inference is generally used to reduce this overfitting risk: rejected samples with inferred default labels are added to the training set. The performance of a reject inference model depends on the accuracy of the inference technique, however, and if the inferred labels are poor, model performance is likely to decline.

This thesis departs from the traditional solution and corrects sample selection bias from a new perspective. Because customers' default patterns change little over a short period, the rules for judging default should change little as well. The thesis therefore controls how much the credit risk control model changes during iterative optimization in order to avoid overfitting. It introduces regularization into the traditional logistic regression algorithm and constructs a smoothing regularization term based on classical L2-norm regularization. This term smooths model iteration by controlling the difference between the new model and the old one, and its coefficient controls the smoothing intensity. Because accepted and rejected samples are distributed differently, the appropriate smoothing intensity also differs between them. The thesis therefore adopts differential modeling: models are built and tested separately for accepted samples and rejected samples.

An empirical analysis is carried out on a real loan data set provided by the Rong360 platform. The experimental results show that the smoothing regularization term encourages smooth iteration of the credit risk control model, and that a model trained with smoothing performs better than one trained without it. The improvement also follows different patterns on different types of test sets, which supports the rationale for differential modeling and testing. The algorithm tunes the coefficient of the smoothing regularization term for accepted samples on a validation set of accepted samples, and tunes the coefficient for rejected samples on a validation set of accepted samples after importance sampling. A comparison with traditional logistic regression and common reject inference methods (hard cut-off, fuzzy augmentation, parcelling and reweighting) finds that the proposed algorithm performs best and improves substantially on traditional logistic regression. The reject inference methods improve substantially on rejected samples but only slightly on accepted samples, whereas the proposed algorithm improves substantially on both.

The thesis also combines the smoothing regularization term with the classical L2-norm regularization term. The experimental results show that the L2-norm term does improve the performance of traditional logistic regression, and that using it together with the smoothing term improves performance further. This suggests that the proposed algorithm has the potential for greater gains when integrated with classical methods. The idea of iteratively smoothing a credit risk control model is general: this thesis applies it to traditional logistic regression, and it can also be transferred to other classification algorithms.
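The abstract does not give the exact objective, so the following is only a minimal sketch of one plausible form of the idea: logistic loss plus a squared L2 penalty on the difference between the new weights and the previous model's weights, optionally combined with an ordinary L2 penalty. The function name `fit_smoothed_logreg`, the parameters `lam_smooth` and `lam_l2`, the plain gradient-descent solver, and the synthetic data are all assumptions made for illustration, not the thesis's implementation.

```python
import numpy as np


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_smoothed_logreg(X, y, w_old, lam_smooth=1.0, lam_l2=0.0,
                        lr=0.1, n_iter=2000):
    """Gradient descent on an ASSUMED objective (not taken from the thesis):

        mean log-loss(w) + lam_smooth * ||w - w_old||^2 + lam_l2 * ||w||^2

    lam_smooth controls how strongly the new model is tied to the old one;
    lam_l2 is the classical L2 penalty.
    """
    n = X.shape[0]
    w = w_old.copy()  # start the new iteration from the old model
    for _ in range(n_iter):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) / n                 # gradient of the mean log-loss
        grad += 2.0 * lam_smooth * (w - w_old)   # smoothing term
        grad += 2.0 * lam_l2 * w                 # classical L2 term
        w -= lr * grad
    return w


# Synthetic stand-in for accepted samples (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = (rng.random(500) < sigmoid(X @ true_w)).astype(float)

# Hypothetical weights of the previous model iteration.
w_old = np.array([0.5, -0.5, 0.0])

w_plain = fit_smoothed_logreg(X, y, w_old, lam_smooth=0.0)
w_smooth = fit_smoothed_logreg(X, y, w_old, lam_smooth=1.0)

print("change without smoothing:", np.linalg.norm(w_plain - w_old))
print("change with smoothing:   ", np.linalg.norm(w_smooth - w_old))
```

In this sketch a larger `lam_smooth` keeps the refit weights closer to `w_old`. Under the differential modeling described in the abstract, one would presumably tune a separate `lam_smooth` for accepted and for rejected samples on the corresponding validation sets.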
Keywords/Search Tags: credit risk control, overfitting, regularization, logistic regression, reject inference