Research On Individual Credit Risk Evaluation Model Based On Big Data

Posted on: 2017-07-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W J Zhang
Full Text: PDF
GTID: 1319330512452616
Subject: Industrial Economics
Abstract/Summary:
“Internet+” has been elevated to a national strategy. E-commerce sites such as Taobao.com, Tmall.com, and JD.com, social platforms represented by the WeChat and QQ apps, and online payment tools such as Alipay and WeChat Pay have become part of daily life. Internet financial services, including Internet payment, P2P lending, crowdfunding, online loans, and online wealth management products, are developing vigorously in China. The Internet and Big Data have profoundly influenced many sectors and business models of the national economy.

Assessing individual credit risk fully and accurately, and delivering personalized financial services on that basis, is crucial to risk control not only at commercial banks and small loan companies but also at other traditional financial institutions. It is equally central to the business of P2P lending companies and other emerging Internet financial institutions. Rising non-performing loan rates are forcing these institutions to keep improving their risk management. When assessing individual credit risk, financial institutions rely too heavily on the system developed by the Credit Reference Center of the People's Bank of China. That system covers more than 860 million individuals, but only 300 million of them have credit records. The records come mainly from commercial banks, rural credit cooperatives, and other financial institutions, and the data fall short in timeliness, comprehensiveness, hierarchy, and other dimensions.

Big Data offers a new approach to personal credit risk assessment. It builds a complete view of the user by analyzing business reputation and behavior in online shopping, transactions, and social interaction, by integrating information scattered across platforms and credit institutions, and by fully mining the user's credit information. It is therefore very important for Internet finance platforms, small loan companies, and other institutions to convert users' business reputation and behavior into a basis for credit rating and to build risk control models on Big Data. Doing so also gives users whose records in the People's Bank of China system are faulty or missing an opportunity to obtain credit services from financial institutions.

Data on users' consumption and social contacts, whether online or offline, have characteristics that differ from traditional credit information, and these characteristics cause traditional individual credit risk assessment models and methods to perform poorly in Big Data applications. First, the data are sparse: because behavior is spread widely online and offline, it is extremely difficult to collect and cover it fully, and the variety of user preferences produces large differences across categories. Second, data coverage is very wide: WeChat or Alipay has more than 4 million active users, and their behaviors span clothing, books, housing, leisure, entertainment, and more, with over 1,000 dimensions for a single index. Last but not least, single variables discriminate risk only weakly: unlike strong variables such as historical performance records and assessments of personal assets, consumption and social-contact variables have little power to distinguish risk.

In traditional credit risk assessment, models are developed from data or from expert experience and are built on Logistic regression, discriminant analysis, and other statistical algorithms to produce accurate credit scores. In the new business scenarios, however, the logical framework of the original business is lost, which limits the application of traditional statistical models. Machine learning methods such as decision trees and neural networks have developed rapidly in recent years and have achieved strong results in information identification, recommendation engines, and other applications. How to combine traditional risk assessment models with advanced machine learning theory so as to evaluate risk more accurately, while preserving business logic and keeping the model widely applicable, is a topic worth studying. The research method of this paper is an attempt in that direction.

This paper focuses on the following aspects of an individual credit risk evaluation model based on Big Data.

1. A research framework for individual credit risk assessment in the Big Data environment, named CreditNet, is put forward after a detailed discussion of the data foundation, the performance logic and definitions, the classification of samples, and the sampling schemes. The infrastructure of CreditNet is divided into four levels, and the model is implemented through three research stages.

2. The concept of the “user profile” is proposed for the first stage of the CreditNet research. It addresses how to collect and organize information effectively in Big Data and how to find latent attributes relevant to credit, through a discussion of how the user profile is built and a description of the system's logical structure and technical framework. The concept of the user credit profile is also put forward, together with a list of its factors. Variable derivation is used to enhance the ability to distinguish risk. The preprocessing steps and methods for Big Data are then described in detail, including data collection, data checking, data cleaning, univariate analysis, and multivariate analysis, which provide the data foundation for this research.

3. The Logistic regression model is combined with Random Forest theory to build the RF-L core model, which handles the second research stage of the CreditNet model. Specifically, before statistical modeling, a Random Forest is built, and single variables are analyzed with the CHAID algorithm to generate binary decision-tree variables. The outputs of the Random Forest are then fed into the Logistic regression to obtain the risk weight of each factor. The RF-L core model combines the advantages of Random Forest and Logistic regression and lays the foundation for an individual credit risk assessment model in the Big Data environment.

4. Ensemble learning algorithms from machine learning theory are discussed for the third research stage of the CreditNet model. After summarizing the concepts of classifier ensembles, the paper introduces the principle, model, and steps of the AdaBoost algorithm in detail, providing the methodological basis for the model ensemble in this paper and thereby improving the risk evaluation performance of the model.

5. The CreditNet model is implemented after the three stages above. Its discriminating power, stability, and performance relative to other models are analyzed to verify its advantages. The model is applied to the credit card business of a joint-stock commercial bank and to a P2P lending company, with satisfactory results. The application scenarios of the model are summarized from three aspects: automated credit approval, diversified credit investigation, and risk monitoring and early warning.
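
The RF-L idea in point 3 (tree-derived binary variables fed into a Logistic regression that assigns each one a risk weight) can be illustrated with a minimal sketch. It is not the dissertation's implementation. As simplifying assumptions, each "tree" is a depth-one stump chosen by Gini impurity rather than a CHAID tree, the data are a toy set rather than credit records, and every function and variable name is hypothetical.

```python
import math
import random

def gini_split(rows, labels, feature):
    """Return the threshold on `feature` with the lowest weighted Gini impurity."""
    def gini(part):
        if not part:
            return 0.0
        p = sum(part) / len(part)
        return 2 * p * (1 - p)
    best_thr, best_score = None, float("inf")
    for thr in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] < thr]
        right = [y for r, y in zip(rows, labels) if r[feature] >= thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if score < best_score:
            best_thr, best_score = thr, score
    return best_thr

def build_forest(rows, labels, n_trees, rng):
    """Fit one stump (feature, threshold) per bootstrap sample and random feature."""
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        feature = rng.randrange(n_features)         # random feature choice
        thr = gini_split([rows[i] for i in idx], [labels[i] for i in idx], feature)
        forest.append((feature, thr))
    return forest

def to_binary(forest, row):
    """Turn one raw record into the binary variables produced by the stumps."""
    return [1.0 if row[f] >= thr else 0.0 for f, thr in forest]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, epochs=500, lr=0.5):
    """Plain batch gradient descent on the logistic loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b

# Toy data (assumption): feature 0 drives the label, feature 1 is noise.
rng = random.Random(0)
rows = [(i, i % 5) for i in range(20)]
labels = [1 if i >= 10 else 0 for i in range(20)]

forest = build_forest(rows, labels, n_trees=20, rng=rng)
X = [to_binary(forest, r) for r in rows]
weights, bias = fit_logistic(X, labels)  # one weight per binary variable
probs = [sigmoid(bias + sum(w * x for w, x in zip(weights, xi))) for xi in X]
accuracy = sum((p >= 0.5) == bool(y) for p, y in zip(probs, labels)) / len(labels)
```

In this sketch the fitted `weights` play the role of the per-factor risk weights: each binary variable keeps a readable meaning (a threshold on one raw variable), which is one plausible reading of why the abstract pairs a forest with a Logistic regression.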
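
The AdaBoost procedure named in point 4 can likewise be sketched in a few lines. This is the textbook binary AdaBoost with one-dimensional threshold stumps on a toy ten-point data set with labels in {+1, -1}. It only illustrates the reweighting steps and is not the ensemble configuration used in the dissertation; all names are hypothetical.

```python
import math

def stump_predict(x, thr, pol):
    """Predict `pol` when x < thr, otherwise -pol."""
    return pol if x < thr else -pol

def fit_stump(xs, ys, w):
    """Return (weighted error, threshold, polarity) of the best stump."""
    best = None
    for thr in sorted(set(xs)):
        for pol in (1, -1):
            err = sum(wi for x, y, wi in zip(xs, ys, w)
                      if stump_predict(x, thr, pol) != y)
            if best is None or err < best[0]:
                best = (err, thr, pol)
    return best

def adaboost(xs, ys, rounds):
    n = len(xs)
    w = [1.0 / n] * n                          # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        err, thr, pol = fit_stump(xs, ys, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, pol))
        # Raise the weight of misclassified points, lower the rest, renormalise.
        w = [wi * math.exp(-alpha * y * stump_predict(x, thr, pol))
             for x, y, wi in zip(xs, ys, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, thr, pol) for alpha, thr, pol in ensemble)
    return 1 if score >= 0 else -1

# Toy data (assumption): no single threshold separates the two classes.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
model = adaboost(xs, ys, rounds=3)
predictions = [predict(model, x) for x in xs]
```

No single stump classifies this toy set correctly, but the weighted vote of three stumps does, which is the effect the abstract relies on when it uses an ensemble to strengthen weak individual classifiers.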
Keywords/Search Tags:Big Data, Individual Credit Risk, Evaluation Model, Random Forest