Research On The Prediction Of Individual Credit Risk Of Bank By Ensemble Learning

Posted on: 2022-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: C Wang
Full Text: PDF
GTID: 2480306311968939
Subject: Statistics
Abstract/Summary:
In recent years, with rising consumer demand and the continued growth of the consumer economy, the personal credit business of major commercial banks in China has enjoyed a favorable environment for development. The emergence of the "Internet plus" business model has led to explosive growth in loan information and has placed higher demands on banks' risk-control capabilities. At the same time, because China's credit reference system is still imperfect, the major commercial banks and other lending platforms largely operate independently of one another. Under this serious information asymmetry, defaults occur easily, which seriously affects the development of the domestic credit market. At present, credit risk prediction in domestic banks still relies mainly on traditional algorithms. Ensemble learning algorithms have begun to be used, but not in depth, and there is no detailed comparative analysis among them.

In this paper, we use two ensemble learning algorithms, random forest and XGBoost, together with the traditional decision tree, to model, predict, and compare results on the latest personal credit data of commercial banks, so as to better reflect the real situation of the current domestic personal credit market. In data processing, the LightGBM algorithm is used to fill missing values, and the result is compared with MissForest filling and with no filling. The results show that the ensemble learning algorithms are much better than the traditional decision tree model in prediction accuracy and stability. XGBoost, as an improved gradient boosting tree, also outperforms the random forest algorithm, with an AUC value above 0.91. As for the missing-value filling methods, the overall performance of the models after filling the data with MissForest or LightGBM is improved compared with no filling, and LightGBM filling is more efficient than MissForest. Therefore, the LightGBM algorithm can be used to fill data in bank credit risk prediction to improve model performance.

In modeling the personal credit data, comparing the feature importance rankings obtained from random forest and XGBoost shows that features describing borrowers' personal information and asset status are very important in both models. Banks should therefore focus on collecting these highly important variables in subsequent business development. It is also necessary to collect more types of data, such as the number of historical loans, the number of historical defaults, loan purpose, and loan term, so as to further improve the risk early-warning ability of the model. The application of ensemble learning algorithms and LightGBM data filling to real personal credit risk prediction in commercial banks, as studied in this paper, has great application value for bank risk control: it can help improve the early-warning ability of bank risk-control systems and promote the healthy development of the domestic personal credit market.
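
The models above are compared by AUC, so a short illustration of that metric may help. The sketch below is not the thesis's code. It assumes the standard pairwise definition of AUC: the fraction of (defaulter, non-defaulter) pairs in which the defaulter receives the higher risk score, with ties counted as one half. The function name `auc` and the toy labels and scores are hypothetical.

```python
def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    labels: 1 = default, 0 = repaid (assumed encoding for this sketch).
    scores: predicted risk scores, higher meaning more likely to default.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # defaulter ranked above non-defaulter
            elif p == n:
                wins += 0.5   # a tie counts as half a correct pair
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the thesis.
labels = [1, 0, 1, 0, 0, 1]
scores_a = [0.9, 0.2, 0.7, 0.4, 0.1, 0.8]  # every defaulter scored above every non-defaulter
scores_b = [0.9, 0.6, 0.3, 0.4, 0.1, 0.8]  # one defaulter scored below two non-defaulters

auc_a = auc(labels, scores_a)
auc_b = auc(labels, scores_b)
print(auc_a, auc_b)
```

Under this definition, a model with an AUC above 0.91 ranks a randomly chosen defaulter above a randomly chosen non-defaulter in more than 91% of such pairs. The quadratic loop is chosen for readability. On real data a rank-based implementation, or a library routine such as scikit-learn's `roc_auc_score`, would be the more practical choice.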
Keywords/Search Tags: Personal Credit Risk, Random Forest, XGBoost, LightGBM Data Filling, MissForest