
Research on a Personal Credit Scoring Model for Internet Credit Based on LightGBM-Logistic Regression

Posted on: 2019-03-26  Degree: Master  Type: Thesis
Country: China  Candidate: W Zhou  Full Text: PDF
GTID: 2439330575950419  Subject: Quantitative Economics
Abstract/Summary:
With the development of the Internet and big data technology, the Internet finance industry has existed in China for more than a decade. As a sub-category of that industry, online lending has gone through rapid growth, policy supervision, and an industry reshuffle since its introduction in 2007, and has now gradually stabilized. Because China's credit information system is still incomplete, accurately assessing personal credit status and effectively controlling default risk in the face of massive Internet data has always been a key issue for the industry. Under the current strict regulatory environment, establishing a more comprehensive risk control system for online lending is of great significance for the sound development of the industry and for reform and innovation in financial technology. To combine the high accuracy of machine learning algorithms with the interpretability of linear models, this paper pairs LightGBM, the efficient open-source algorithm released by MSRA in 2016, with a logistic regression model. First, the information value (IV) of each feature variable is calculated, and a logistic regression model is built from the variables with the strongest discriminating power. The remaining variables, which were not selected for the regression, are then modeled with the LightGBM algorithm, and the output of that model is added to the original logistic regression as an additional explanatory variable, yielding the LightGBM-Logistic regression model. In the empirical study, the model is first tested on real transaction data from a domestic online lending platform, where the LightGBM-Logistic regression model is found to be more accurate in prediction than a logistic regression model alone. The model is then applied to the LendingClub data set covering the first quarter of 2007 through the second quarter of 2017, where experiments show that its results are more interpretable. Internet data is characterized by wide coverage, sparseness, and weak explanatory power of individual variables, so it is crucial to extract the value relevant to one's own business from the data accurately and efficiently. This paper makes a useful attempt in the field of personal credit risk for online lending.
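The IV screening step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's binning scheme or IV formula, so the code below assumes the common weight-of-evidence definition, IV = sum over bins of (good share - bad share) * ln(good share / bad share), applied to a feature that has already been binned. The function name `information_value`, the `smoothing` parameter, and the toy data are hypothetical and are not taken from the thesis.

```python
import math
from collections import defaultdict


def information_value(bins, defaults, smoothing=0.5):
    """Return the information value (IV) of one pre-binned feature.

    bins      -- bin label of the feature for each borrower
    defaults  -- 1 if the borrower defaulted ("bad"), 0 otherwise ("good")
    smoothing -- assumed additive constant that keeps empty cells from
                 producing log(0); not part of the textbook definition
    """
    good = defaultdict(float)
    bad = defaultdict(float)
    for label, flag in zip(bins, defaults):
        if flag == 1:
            bad[label] += 1
        else:
            good[label] += 1

    labels = set(good) | set(bad)
    total_good = sum(good.values()) + smoothing * len(labels)
    total_bad = sum(bad.values()) + smoothing * len(labels)

    iv = 0.0
    for label in labels:
        share_good = (good[label] + smoothing) / total_good
        share_bad = (bad[label] + smoothing) / total_bad
        woe = math.log(share_good / share_bad)  # weight of evidence of the bin
        iv += (share_good - share_bad) * woe
    return iv


# Toy data: an income band that separates defaulters from non-defaulters,
# and a region code that does not.
defaults = [0, 0, 0, 1, 1, 1, 1, 0]
income_band = ["low"] * 4 + ["high"] * 4
region = ["east", "west"] * 4

iv_income = information_value(income_band, defaults)
iv_region = information_value(region, defaults)
print(f"income_band IV = {iv_income:.4f}")
print(f"region IV      = {iv_region:.4f}")
```

Under this definition a variable whose bins have the same share of good and bad borrowers gets an IV of zero, and a larger IV indicates stronger discriminating power. In the workflow the abstract describes, variables above some IV cutoff would enter the logistic regression directly and the rest would be passed to LightGBM; the cutoff itself is not stated in the abstract.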
Keywords/Search Tags: Credit rating, LightGBM algorithm, Logistic regression model