
Research On Vehicle Loan Default Prediction Problem Based On Machine Learning

Posted on: 2022-11-13  Degree: Master  Type: Thesis
Country: China  Candidate: Y Liu  Full Text: PDF
GTID: 2512306611995719  Subject: Investment
Abstract/Summary:
The booming automotive industry has driven rapid growth in the auto loan business. Auto loans are popular because they are unsecured and approved quickly. However, if customers who are likely to default cannot be identified in time, a higher default rate will cause significant losses for lenders. To address this problem, this paper uses the dataset of an auto finance company as an example. The original data were first modeled with a random forest model and the LightGBM algorithm. The AUC of LightGBM was about 0.4 higher than that of the random forest model, but the metrics for the positive samples were relatively low. The paper therefore resamples the data, aiming to improve the metrics of both models and to compare their fit further. The study proceeds as follows: (1) Data preprocessing: detecting missing values and outliers, removing single-valued and duplicate values, encoding the data, and selecting features. (2) Selecting the best method at the data level and the algorithm level. Three sampling methods are considered: SMOTE (oversampling), ENN (undersampling), and SMOTEENN (mixed sampling). Each method is applied separately to the imbalanced data, and a random forest model and a LightGBM model are then built on the result. LightGBM combined with SMOTEENN performs best, with an AUC of 0.9381 and a positive-sample F1 score of 0.8886.
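The resampling idea in step (2) can be illustrated with a minimal pure-Python sketch. This is not the thesis's implementation: the function names (`make_toy_data`, `smote`, `enn`, `smote_enn`), the toy data, and the choice of k = 3 are all assumptions made for illustration. The ENN step here uses a simple majority vote over both classes, which is only one of several possible selection rules. In practice a library such as imbalanced-learn (`imblearn.combine.SMOTEENN`) would normally be used instead.

```python
import math
import random


def make_toy_data(n_major=40, n_minor=6, seed=0):
    """Hypothetical imbalanced 2-D data: label 0 = majority, label 1 = minority."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n_major)]
    X += [[rng.gauss(2, 0.5), rng.gauss(2, 0.5)] for _ in range(n_minor)]
    y = [0] * n_major + [1] * n_minor
    return X, y


def knn_indices(X, i, k):
    """Indices of the k nearest neighbours of X[i] (Euclidean), excluding i."""
    dists = sorted((math.dist(X[i], X[j]), j) for j in range(len(X)) if j != i)
    return [j for _, j in dists[:k]]


def smote(X, y, minority=1, k=3, seed=0):
    """SMOTE-style oversampling: add interpolated minority points until the
    two classes are the same size. Assumes at least two minority samples."""
    rng = random.Random(seed)
    X_min = [x for x, label in zip(X, y) if label == minority]
    n_needed = (len(X) - len(X_min)) - len(X_min)
    synthetic = []
    for _ in range(max(n_needed, 0)):
        i = rng.randrange(len(X_min))
        j = rng.choice(knn_indices(X_min, i, min(k, len(X_min) - 1)))
        gap = rng.random()  # position along the segment between the two points
        synthetic.append([a + gap * (b - a) for a, b in zip(X_min[i], X_min[j])])
    return X + synthetic, y + [minority] * len(synthetic)


def enn(X, y, k=3):
    """ENN-style cleaning: keep a sample only if more than half of its
    k nearest neighbours share its label (simplified majority-vote rule)."""
    keep = []
    for i in range(len(X)):
        neighbours = knn_indices(X, i, k)
        agree = sum(1 for j in neighbours if y[j] == y[i])
        if 2 * agree > len(neighbours):
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_enn(X, y, k=3, seed=0):
    """Mixed sampling: oversample with SMOTE, then clean with ENN."""
    X_over, y_over = smote(X, y, k=k, seed=seed)
    return enn(X_over, y_over, k=k)


if __name__ == "__main__":
    X, y = make_toy_data()
    X_res, y_res = smote_enn(X, y)
    print("before:", y.count(0), y.count(1))
    print("after: ", y_res.count(0), y_res.count(1))
```

The resampled data would then be passed to the classifier (random forest or LightGBM in the thesis). Resampling should be applied to the training split only, so that evaluation metrics such as AUC and F1 reflect the original class distribution.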
Keywords/Search Tags: random forest, LightGBM, auto loan default forecast, imbalanced data