Loan Default Prediction Based On Knowledge Graph And Machine Learning

Posted on: 2024-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: J Yuan
GTID: 2568307142452094
Subject: Computer technology
Abstract/Summary:
Loan default risk is the possibility that a borrower is unwilling or unable to repay a matured debt, which constitutes a breach of contract. It bears directly on the financial safety and stable operation of banks and other financial institutions. Identifying loans that are likely to default can reduce the potential losses on loans already issued. At present, financial institutions usually rely on mathematical statistics or machine learning algorithms to predict loan default. These methods cannot effectively handle the complex entity relationships in financial data, whereas knowledge graphs can mine the semantic information in such data. This thesis therefore proposes the IGWO-XGB-KG model, which combines knowledge graph and machine learning techniques. After building a financial knowledge graph, we use knowledge representation learning to embed it into a low-dimensional dense vector space, obtain the embedding vectors of the loan entities, and feed them into an XGBoost model together with the other features for training. Finally, the grey wolf optimization algorithm is improved with Logistic-Tent chaotic mapping and an elimination mechanism, and is used to optimize the hyperparameters of the XGB-KG model.

The main research content of this thesis covers the following aspects:

(1) The data quality of online lending datasets drawn from real business scenarios is examined in depth, the data features are analyzed visually, and the data are then preprocessed. A semantic data model is designed according to the business rules, and the financial knowledge graph is constructed top-down and stored in the Neo4j graph database.

(2) An XGB-KG model combining knowledge representation learning and extreme gradient boosting is proposed. A graph autoencoder is built from a two-layer relational graph convolutional network and the ComplEx model, and the resulting vector representation of the financial knowledge graph is input into the XGBoost model as a supplementary feature. The experimental results show that using knowledge graph embeddings as supplementary features improves the performance of classification models in predicting loan default risk. Compared with XGBoost, the XGB-KG model proposed in this thesis improves on all evaluation metrics.

(3) Because the XGB-KG model has many hyperparameters and a complex structure, this thesis proposes IGWO, an improved version of the grey wolf optimization algorithm that adds Logistic-Tent chaotic mapping and an elimination mechanism, and uses this swarm intelligence algorithm to optimize the hyperparameters of the XGB-KG model. The experimental results show that, compared with XGBoost, the proposed IGWO-XGB-KG method improves the F1 and AUC scores on the loan default prediction task by 5.5% and 3.6%, respectively.

In summary, this thesis provides a reference for associating multi-source data and for applying knowledge graphs to loan default prediction. The constructed model achieves good classification results and can identify potentially defaulting loans accurately and efficiently. It can serve as a reference for risk management in the financial field and is significant for safeguarding financial funds.
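
The abstract does not give the exact form of the Logistic-Tent map or say how it enters the grey wolf optimizer. As a minimal sketch, assuming one commonly cited form of the Logistic-Tent system and assuming it is used to initialize the wolf population, the idea could look like the following. The function names, the control parameter `r`, the seed value, and the hyperparameter ranges are all hypothetical and are not taken from the thesis.

```python
def logistic_tent(x, r=3.99):
    """One step of a Logistic-Tent chaotic map; returns a value in [0, 1).

    Assumed form: a logistic term plus a tent term, reduced modulo 1.
    """
    if x < 0.5:
        return (r * x * (1.0 - x) + (4.0 - r) * x / 2.0) % 1.0
    return (r * x * (1.0 - x) + (4.0 - r) * (1.0 - x) / 2.0) % 1.0


def chaotic_population(n_wolves, bounds, seed=0.37, r=3.99):
    """Build an initial population by scaling a chaotic sequence into bounds.

    bounds is a list of (low, high) pairs, one per hyperparameter.
    """
    x = seed
    population = []
    for _ in range(n_wolves):
        wolf = []
        for low, high in bounds:
            x = logistic_tent(x, r)          # next chaotic value in [0, 1)
            wolf.append(low + x * (high - low))
        population.append(wolf)
    return population


# Hypothetical search ranges for three XGBoost hyperparameters:
# learning rate, maximum tree depth, and row subsampling ratio.
bounds = [(0.01, 0.3), (3.0, 10.0), (0.5, 1.0)]
wolves = chaotic_population(5, bounds)
```

Each entry of `wolves` is one candidate hyperparameter vector. A chaotic sequence is typically used in place of uniform random sampling in the hope of spreading the initial candidates more evenly over the search space. The elimination mechanism mentioned in the abstract is not sketched here because the abstract does not describe it.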
Keywords/Search Tags: loan default forecast, knowledge graph, machine learning, graph convolution network, grey wolf optimizer