
Research On Credit Scoring Model Based On Classification Enhanced Regression Algorithm

Posted on: 2024-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: X Yu
GTID: 2530307100489014
Subject: Electronic information
Abstract/Summary:
With social and economic development, enterprises and governments are paying increasing attention to credit risk evaluation. Major enterprises, including China's three major telecom operators, have established their own credit scoring systems. Because the credit system is still imperfect, overdue payment, breach of contract, and fraud persist across all operators, harming both the operators and the affected users. Improving the accuracy of user credit evaluation for telecom operators is therefore an urgent problem. Using a mobile user dataset, this thesis applies data mining to user data and proposes a more reliable credit scoring model to help telecom operators flag risky users, send them reminders, and restrict their permissions. The main contributions are as follows:

(1) After a preliminary analysis of the mobile user dataset, a hybrid outlier processing method based on isolation forest and random forest is proposed. The method uses the isolation forest algorithm to detect and remove extreme outliers, then uses the random forest algorithm to predict replacement values for the removed entries. Experimental results show that the method effectively detects and corrects abnormal data, improving the quality of the dataset.

(2) A filter-based feature selection method based on RReliefF and the maximal information coefficient (MIC) is proposed. RReliefF can effectively remove irrelevant features but cannot remove redundant ones, so the method uses the MIC to remove redundant features, combining the strengths of both techniques to compensate for this weakness of RReliefF. Screening the feature set in this way yields the optimal feature subset and mitigates the negative impact of irrelevant and redundant features. Experimental results show that the method effectively selects the optimal feature subset, improves model performance to some extent, and reduces model training time to varying degrees.

(3) A classification-enhanced regression algorithm based on LightGBM is proposed to build the credit scoring model. The algorithm first searches exhaustively for the best threshold for dividing the credit scores, then builds a LightGBM binary classification model as the base model. It splits the user data into two classes according to the base model's classification results and builds a separate LightGBM regression model for each class, so that each class is fitted at a finer granularity. Finally, the two regression models are combined by weighted fusion to obtain the final regression model. Experiments show that the resulting model outperforms the benchmark model and meets the needs of mobile user credit scoring.

In summary, this thesis proposes a hybrid outlier processing algorithm to improve dataset quality, a filter-based feature selection method to remove irrelevant and redundant features from the feature set, and a credit scoring model built on the classification-enhanced regression algorithm, offering operators a feasible approach to credit scoring.
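
The classification-enhanced regression procedure in contribution (3) can be illustrated with a minimal sketch. This is not the thesis's implementation: to stay self-contained it replaces the LightGBM classifier with a one-feature decision stump and the LightGBM regressors with least-squares lines. It also assumes that the threshold is scored by mean absolute error on the training data and that "weighted fusion" means blending the regressor of the predicted class with the other regressor using a fixed weight. The abstract does not specify either detail, and all function and parameter names (`fit_cer`, `predict`, `own_weight`) are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares line y = a*x + b (stand-in for a LightGBM regressor)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return a, my - a * mx


def fit_stump(xs, labels):
    """Cut on x that best reproduces the labels (stand-in for a LightGBM classifier)."""
    best_cut, best_hits = None, -1
    for cut in sorted(set(xs)):
        hits = sum((x >= cut) == lab for x, lab in zip(xs, labels))
        if hits > best_hits:
            best_cut, best_hits = cut, hits
    return best_cut


def predict(model, x):
    """Blend the predicted class's regressor with the other one (assumed fusion rule)."""
    cut, reg_lo, reg_hi, own_weight = model
    lo = reg_lo[0] * x + reg_lo[1]
    hi = reg_hi[0] * x + reg_hi[1]
    own, other = (hi, lo) if x >= cut else (lo, hi)
    return own_weight * own + (1.0 - own_weight) * other


def fit_cer(xs, ys, own_weight=0.8):
    """Exhaustively try score thresholds; return (error, threshold, model) or None."""
    best = None
    for t in sorted(set(ys))[1:]:  # skip the minimum so both classes are non-empty
        labels = [y >= t for y in ys]          # binarise scores at threshold t
        cut = fit_stump(xs, labels)            # base classification model
        hi = [(x, y) for x, y in zip(xs, ys) if x >= cut]
        lo = [(x, y) for x, y in zip(xs, ys) if x < cut]
        if len(hi) < 2 or len(lo) < 2:
            continue                           # too few samples to fit a line
        model = (cut, fit_line(*zip(*lo)), fit_line(*zip(*hi)), own_weight)
        err = mean(abs(predict(model, x) - y) for x, y in zip(xs, ys))
        if best is None or err < best[0]:
            best = (err, t, model)
    return best
```

In practice the threshold would more plausibly be chosen on a held-out validation set than on the training data, and the fusion weight would be tuned rather than fixed; both choices here are simplifications for illustration.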
Keywords/Search Tags:data mining, credit scoring model, outlier processing, feature selection, classification enhanced regression algorithm