
Research and Application of a Personal Credit Scoring Model Based on Internet Financial Data

Posted on: 2021-01-02
Degree: Master
Type: Thesis
Country: China
Candidate: H C Zhang
GTID: 2439330623459012
Subject: Applied Statistics
Abstract/Summary:
The development of the Internet is affecting the lives of a large number of people in the country. Internet finance in particular is developing rapidly, but a major challenge facing this part of the industry is how to assess personal credit risk. Traditional credit risk assessment relies heavily on central bank data, yet at present the people with credit records in the central bank's credit reporting system account for only 20% of the country's total population. Big data from the Internet has the advantage of wide coverage: if a multi-dimensional user portrait can be constructed from user behavior characteristics, it can provide credit evaluation for users who are not covered by the central bank's credit reporting system.

Applying big data to credit evaluation also has difficulties. Data from different sources vary widely in their characteristics, mainly because big data has (1) low data quality, (2) wide coverage, and (3) low correlation of individual variables. Unlike the strongly correlated variables used in traditional risk models, big data variables are basically weakly correlated, so the requirements on model accuracy are higher. The traditional credit risk assessment model is built by using the opinions of industry experts to construct a credit reference model and then applying a simple statistical model to obtain the final result. In the new big data scenario, this method is no longer applicable because of the high dimensionality of the data, and a new set of solutions is required.

Aiming at the problems of high data dimensionality and feature sparseness in real Internet financial customer credit scoring, this paper adopts the idea of grouping modeling and proposes a feature selection method based on IV values and a weighted average model based on logistic regression, random forest, and CatBoost. The study found that grouping modeling reduced the feature sparseness problem, and that the improved IV values showed the influence of each feature on the results and could indicate its mutation threshold. The accuracy of the weighted average model was higher than that of each single model. The model is verified on desensitized data provided by a domestic financial institution: the AUC of direct modeling is 0.56, and the AUC of the LRC modeling method is 0.74.
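As a rough illustration of the two ideas named in the abstract, the sketch below computes the standard information value (IV) of one binned feature and takes a weighted average of several models' predicted probabilities. It is not the thesis's method: the abstract does not define its improved IV or its model weights, so the function names, the smoothing constant `eps`, the label convention (1 = bad), and the weights used here are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def information_value(bins, labels, eps=0.5):
    """Standard IV of one binned feature against a binary label.

    bins   : bin identifier for each sample (any hashable value)
    labels : 1 for a "bad" sample, 0 for a "good" one (assumed convention)
    eps    : smoothing added to every bin count so that an empty cell
             does not produce log(0); the value is an assumption
    """
    good = defaultdict(float)
    bad = defaultdict(float)
    for b, y in zip(bins, labels):
        if y == 1:
            bad[b] += 1
        else:
            good[b] += 1

    keys = set(good) | set(bad)
    total_good = sum(good[k] + eps for k in keys)
    total_bad = sum(bad[k] + eps for k in keys)

    iv = 0.0
    for k in keys:
        dist_good = (good[k] + eps) / total_good
        dist_bad = (bad[k] + eps) / total_bad
        woe = math.log(dist_good / dist_bad)  # weight of evidence of bin k
        iv += (dist_good - dist_bad) * woe
    return iv


def weighted_average(prob_lists, weights):
    """Blend per-model probability lists with the given weights.

    prob_lists : one list of predicted probabilities per model,
                 all of the same length
    weights    : one non-negative weight per model (hypothetical values;
                 they are normalized here so they need not sum to 1)
    """
    total = sum(weights)
    n = len(prob_lists[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n)
    ]
```

In a feature selection step of this kind, features would typically be ranked by IV and those below a chosen cutoff dropped before fitting the individual models; the cutoff itself is a modeling choice that the abstract does not state.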
Keywords/Search Tags:group modeling, IV value, weighted average model, credit risk