| With the continuous growth of credit in our country, various credit problems involving banks and merchants have emerged. It is therefore crucial to build an efficient and stable credit evaluation model for bank merchants. Credit evaluation is a binary classification problem. Most banks use a single machine learning model or a stacking fusion model to make the classification prediction, and then use a logistic regression model to convert the result into a credit score. This approach struggles to meet banks' increasingly complex credit business needs, mainly because classification accuracy is low and because no secondary classification prediction is made for "fast qualified" merchants whose credit scores are obtained through the linear transformation of logistic regression. In response to these issues, the main research content of this article is as follows: (1) Propose the NT (New Random SMOTE + Tomek Links) hybrid sampling algorithm for handling sample imbalance. The NT algorithm combines NR (New Random)-SMOTE oversampling with Tomek Links undersampling. Three sampling algorithms, NR-SMOTE, Tomek Links, and NT, were used to balance the samples and were validated on two public datasets, Wine and Heart, to find the optimal imbalance-handling algorithm and achieve the optimal balance state for the dataset. (2) Construct credit evaluation models based on machine learning. The traditional grid search method and an improved grid search method are each used to optimize the parameters of seven credit evaluation models: random forest, logistic regression, XGBoost, LightGBM, CatBoost, k-nearest neighbor, and support vector machine (SVM). The results indicate that the improved grid search method significantly reduces model runtime. (3) Propose an improved stacking credit evaluation model. Building on Chapter 3, four stacking credit evaluation models are constructed using the grid search method and the improved grid search method respectively. Three improvements are then made to the stacking model: 1) building a three-layer stacking fusion model in which the Bagging and Boosting models with good classification performance are placed in the first and second layers; 2) adding the original dataset to the input of the second-layer model; 3) weighting the output of each single model by its F1 value before passing it to the next layer. The experimental results indicate that the improved stacking model achieves some improvement in classification and prediction performance. (4) Design and implement a credit scoring system. The improved stacking model is used to make a secondary classification prediction for the "fast qualified" merchants whose credit scores come from the linear transformation of logistic regression, and a credit scoring system is built on this basis. The system has two roles: merchant and administrator. The merchant's main functions are querying personal credit ratings and credit rating history. The administrator's main functions are reviewing merchants' credit ratings and viewing credit rating trends. |
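
The Tomek Links undersampling step named in item (1) can be illustrated with a minimal sketch. It follows the standard definition of a Tomek link (two samples from different classes that are each other's nearest neighbor) and removes the majority-class member of each link. It is not the NT algorithm itself: the NR-SMOTE oversampling step is not shown because the abstract does not define it. The function names and the toy data are assumptions made for illustration only.

```python
import math


def nearest_neighbor(i, X):
    """Return the index of the sample closest to X[i] (Euclidean), excluding i."""
    best_j, best_d = None, math.inf
    for j, x in enumerate(X):
        if j == i:
            continue
        d = math.dist(X[i], x)
        if d < best_d:
            best_j, best_d = j, d
    return best_j


def tomek_links(X, y):
    """Return index pairs (i, j), i < j, that are mutual nearest neighbors
    with different class labels."""
    nn = [nearest_neighbor(i, X) for i in range(len(X))]
    return [
        (i, j)
        for i, j in enumerate(nn)
        if i < j and nn[j] == i and y[i] != y[j]
    ]


def remove_tomek_majority(X, y, majority_label):
    """Drop the majority-class sample of every Tomek link (undersampling)."""
    drop = set()
    for i, j in tomek_links(X, y):
        drop.add(i if y[i] == majority_label else j)
    keep = [k for k in range(len(X)) if k not in drop]
    return [X[k] for k in keep], [y[k] for k in keep]


# Hypothetical toy data: label 0 is treated as the majority class.
X_demo = [(0.0, 0.0), (0.2, 0.0), (1.0, 1.0), (1.1, 1.0), (3.0, 3.0), (3.1, 3.0)]
y_demo = [0, 0, 0, 1, 1, 1]

links = tomek_links(X_demo, y_demo)
X_res, y_res = remove_tomek_majority(X_demo, y_demo, majority_label=0)
```

In this toy data only samples 2 and 3 form a Tomek link, so the majority-class sample at (1.0, 1.0) is removed. A hybrid scheme such as the one described would typically apply this cleaning after oversampling the minority class, though the exact ordering used in the article is not stated in the abstract.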