Changing attitudes toward consumption have led more people to use loans to enjoy services in advance. The resulting credit risk has exposed the shortcomings of many financial institutions in handling loans. At present, an effective way for financial institutions to reduce risk is to build a credit risk assessment model based on customers' basic information and behavioral information. However, the class imbalance of basic credit data and the difficulty of using transaction behavior information limit the effectiveness of credit risk assessment. This thesis proposes solutions to these problems. Its contributions are as follows:

(1) To address the imbalance of credit data, we propose a mixed sampling method based on weak classifiers. The credit data is partitioned by multiple weak classifiers, and different sampling modes are then applied to the minority data and the majority data according to the classification results, so that the credit loan data becomes balanced. Experimental results on four credit loan data sets show that, compared with other data balancing methods, the proposed method gives the model better recognition ability for minority data. At the same time, the balancing does not cause serious loss of the information contained in the majority data.

(2) To address the problems that existing sample distance measures are not suited to credit loan data and that dynamic ensemble models lack flexibility, we propose a credit risk assessment algorithm based on basic customer information. First, we propose a sample distance measure for credit loan data to optimize neighborhood selection in the dynamic ensemble. In addition, a stacking strategy is used to improve the original ensemble method. The experimental results show that the proposed distance measure is better suited to credit loan data than traditional distance measures. Compared with other credit risk assessment models that use basic information, the proposed method is better at identifying defaulting users in credit loan data, and it also performs well in identifying all users.

(3) To address the problem that transaction behavior information is difficult to process and apply, we propose a credit risk assessment algorithm that integrates transaction behavior. Behavioral information at different time granularities is extracted with multi-scale convolution and fused with the customers' basic information, yielding a credit risk assessment method that combines time-series behavior information with traditional basic information. The experimental results show that transaction behavior information plays an important role in credit risk assessment. Compared with other methods, the proposed method captures the information in transaction behavior more comprehensively and achieves the best credit evaluation performance among the compared methods.

In summary, the imbalanced data processing method proposed in this thesis effectively improves the quality of credit data, and the two proposed models, one based on dynamic ensemble and one that integrates transaction behavior, assess the user’s credit situation more accurately. The resulting credit scores can help financial institutions reduce credit risk and can also inform related risk management and control.
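The mixed sampling idea in contribution (1) can be illustrated with a minimal sketch. It is not the thesis's implementation; several details are assumptions made only for illustration: the weak classifiers are single-feature threshold stumps (`make_stump`), a sample's difficulty is the number of weak classifiers that misclassify it (`error_count`), minority samples are oversampled with weights that grow with difficulty, and majority samples are undersampled by keeping the most difficult ones first. Label 1 is assumed to mark the minority (default) class.

```python
import random
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (features, label); label 1 = minority (assumed)
WeakClassifier = Callable[[Sequence[float]], int]


def make_stump(feature: int, threshold: float) -> WeakClassifier:
    """Hypothetical weak classifier: predicts 1 when one feature exceeds a threshold."""
    return lambda x: 1 if x[feature] > threshold else 0


def error_count(sample: Sample, stumps: List[WeakClassifier]) -> int:
    """Number of weak classifiers that misclassify the sample (assumed difficulty score)."""
    x, y = sample
    return sum(1 for clf in stumps if clf(x) != y)


def mixed_sample(data: List[Sample], stumps: List[WeakClassifier], seed: int = 0) -> List[Sample]:
    """Balance the data: oversample the minority class, undersample the majority class."""
    rng = random.Random(seed)
    minority = [s for s in data if s[1] == 1]
    majority = [s for s in data if s[1] == 0]
    if not minority or not majority:
        raise ValueError("both classes must be present")

    # Each class is resampled to half of the original data set size.
    target = len(data) // 2

    # Oversampling (assumption): harder minority samples are duplicated more often.
    weights = [1 + error_count(s, stumps) for s in minority]
    extra = rng.choices(minority, weights=weights, k=max(0, target - len(minority)))

    # Undersampling (assumption): keep the hardest majority samples first,
    # dropping those that every weak classifier already handles correctly.
    ranked = sorted(majority, key=lambda s: error_count(s, stumps), reverse=True)
    kept = ranked[:target]

    return minority + extra + kept


stumps = [make_stump(0, 0.5), make_stump(1, 0.5)]
data = [((0.9, 0.8), 1), ((0.2, 0.7), 1)] + [((i / 10, 0.1), 0) for i in range(8)]
balanced = mixed_sample(data, stumps)
```

In this toy data set, two minority and eight majority samples are resampled to five of each. Keeping the hardest majority samples is one way to limit the loss of majority information near the decision boundary; the thesis may use a different criterion.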