An Ensemble Learning Approach To Personal Credit Risk Assessment

Posted on: 2020-01-03
Degree: Master
Type: Thesis
Country: China
Candidate: Q F Yuan
GTID: 2439330623952520
Subject: Applied statistics
Abstract/Summary:
This paper studies personal credit risk assessment on unbalanced data. The data are historical business records of lending institutions, provided as Topic 2 of the "DongZheng Future Cup" China Undergraduate Statistical Contest in Modeling. The data set records 30,000 users, of whom 1,532 defaulted (about 5.1%), so it is highly unbalanced. First, for missing-value handling, KNN imputation is compared with missForest imputation; missForest imputation yields a larger AUC on the validation set than KNN imputation. For variable selection, this paper proposes an improved random forest feature selection method. Because credit data are unbalanced, this paper compares two strategies: balancing the data before modeling, and applying class-imbalance ensemble learning directly. Class-imbalance ensemble learning gives the best predictions, raising the test-set AUC by up to 3.8 percentage points. The experiments show that Bagging predicts better on this data set than RUSBoost, and that the random forest base learner attains the largest AUC on the test set.
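To make the idea of class-imbalance ensemble learning concrete, here is a minimal sketch of one common variant: bagging in which every base learner is trained on a balanced subsample, scored by AUC. It is not the thesis's implementation. The synthetic two-feature data, the decision-stump base learner (used in place of random forest to keep the example dependency-free), and all function names are assumptions made for illustration; only the default rate of 1,532 out of 30,000 comes from the abstract.

```python
import random

def auc(labels, scores):
    """AUC as the share of (defaulter, non-defaulter) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_stump(X, y):
    """Hypothetical base learner: the one-feature threshold rule with the fewest errors."""
    best = None
    for j in range(len(X[0])):
        values = sorted(row[j] for row in X)
        step = max(1, len(values) // 20)  # try roughly 20 candidate thresholds per feature
        for t in values[::step]:
            for sign in (1, -1):
                errors = sum((1 if sign * (row[j] - t) > 0 else 0) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, sign)
    _, j, t, sign = best
    return lambda row: 1 if sign * (row[j] - t) > 0 else 0

def fit_under_bagging(X, y, n_estimators=25, seed=0):
    """Train each stump on a bootstrap of the defaulters plus an equal-sized sample of non-defaulters."""
    rng = random.Random(seed)
    pos_idx = [i for i, label in enumerate(y) if label == 1]
    neg_idx = [i for i, label in enumerate(y) if label == 0]
    stumps = []
    for _ in range(n_estimators):
        idx = rng.choices(pos_idx, k=len(pos_idx)) + rng.sample(neg_idx, len(pos_idx))
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    # The ensemble score is the fraction of stumps that vote "default".
    return lambda row: sum(s(row) for s in stumps) / len(stumps)

def make_data(n, default_rate, rng):
    """Synthetic stand-in for the contest data: defaulters are shifted on both features."""
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < default_rate else 0
        X.append([rng.gauss(1.5 * label, 1.0), rng.gauss(0.5 * label, 1.0)])
        y.append(label)
    return X, y

rng = random.Random(42)
default_rate = 1532 / 30000  # class ratio reported in the abstract
X_train, y_train = make_data(2000, default_rate, rng)
X_test, y_test = make_data(1000, default_rate, rng)

model = fit_under_bagging(X_train, y_train)
test_auc = auc(y_test, [model(row) for row in X_test])
print(f"default rate: {default_rate:.3%}, test AUC: {test_auc:.3f}")
```

AUC is used here, as in the abstract, because it depends only on how defaulters are ranked relative to non-defaulters. Accuracy would be misleading on such data: a model that never predicts default would still be right for roughly 95% of users.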
Keywords/Search Tags: Credit score, Unbalanced data, Ensemble learning