Analysis Of Individual Credit Evaluation Indicators Based On Random Forests

Posted on: 2019-05-31
Degree: Master
Type: Thesis
Country: China
Candidate: M Q Wang
GTID: 2348330545998908
Subject: Applied Statistics
Abstract/Summary:
In today’s society, credit consumption has gradually become part of everyday life. Data show that more and more residents are shifting their financial habits from traditional saving to loan-based consumption, although cash still holds the dominant position in our country at the present stage. Judging whether a customer is reliable, and whether credit fraud is likely to occur, is therefore both very difficult and very important.

The main purpose of this study is to analyze the many indicators that affect credit evaluation when commercial banks and other financial institutions lend to customers, to establish a corresponding index system, and to classify customers under that index system. We then aim to determine which model classifies more precisely and which is more general and applicable in everyday practice. Based on an analysis of the main classification models used by credit institutions and of the advantages and disadvantages of the random forest, this paper selects a combined random forest algorithm designed for unbalanced data to analyze personal credit evaluation. A simulation experiment is then carried out on real credit card transaction data from the cardholders of a commercial bank, with the aim of offering valuable policy recommendations for credit evaluation at commercial banks.

Firstly, because credit data are typically numerous, complex, and large in volume, we preprocess the original data before formal modeling and use principal component analysis to simplify the random forest input, reduce the dimension, and extract the main information. This yields a reasonable evaluation index system for personal credit. Secondly, a decision tree, as a single classifier, is often severely limited in dealing with practical problems, so this paper selects a combined classifier, the random forest. In real credit evaluation cases, fraudulent customers are usually very few, so handling such unbalanced data is another problem we need to face. Therefore, this paper first uses undersampling to draw several sample subsets from the majority-class training samples and combines each subset with the minority-class samples to form a new training set, to which the random forest algorithm is applied. In this way, a new algorithm combining undersampling and random forest is proposed.

Finally, the algorithm is tested in simulation on the credit data of a European bank. In the empirical analysis, we compare the classification performance of different classifiers (SVM, Logistic, RF) under different sampling scales (1, 2, and 3). Measured by Recall, F-mean, and AUC, the random forest trained on balanced data performs better than the other models.

The results show that many indicators affect credit evaluation and that the data are unbalanced. Therefore, when building a model to analyze customer credit evaluation indicators on extremely unbalanced data, balancing the data can effectively improve classification performance. Among the models compared, the combined random forest model applied after data balancing performs better than the other models constructed in this paper. We therefore conclude that this model can be widely applied to other fields, with high accuracy and good applicability.
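The undersampling step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names `undersample_subsets` and `recall`, the parameters `ratio`, `n_subsets`, and `seed`, and the reading of "sampling scale" as the majority-to-minority ratio in each subset are all assumptions made for illustration. Each resulting subset would then be passed to a random forest learner, which is omitted here.

```python
import random


def undersample_subsets(majority, minority, ratio=1, n_subsets=5, seed=0):
    """Build balanced training subsets by undersampling the majority class.

    Each subset contains all minority samples plus `ratio * len(minority)`
    majority samples drawn without replacement. `ratio` is an assumed
    reading of the "sampling scale" (1, 2, 3) mentioned in the abstract.
    """
    rng = random.Random(seed)
    # Never ask for more majority samples than actually exist.
    k = min(len(majority), ratio * len(minority))
    subsets = []
    for _ in range(n_subsets):
        sampled_majority = rng.sample(majority, k)
        subsets.append(sampled_majority + list(minority))
    return subsets


def recall(y_true, y_pred, positive=1):
    """Recall = TP / (TP + FN) for the positive (minority) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


if __name__ == "__main__":
    # Hypothetical toy data: (sample_id, label), label 1 = minority class.
    majority = [(i, 0) for i in range(100)]
    minority = [(i, 1) for i in range(100, 110)]
    for subset in undersample_subsets(majority, minority, ratio=2, n_subsets=3):
        n_pos = sum(1 for _, label in subset if label == 1)
        print(len(subset), n_pos)
```

Drawing several subsets rather than one lets every minority sample appear in each training set while different parts of the majority class are seen across subsets, which is one plausible motivation for combining undersampling with an ensemble method.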
Keywords/Search Tags: credit evaluation, decision tree, random forest, unbalanced classification problem