A Comparative Study Of Personal Credit Evaluation Model On Net Lending Platforms Based On Extremely Randomized Trees With Logistic Regression Algorithm

Posted on: 2021-01-04
Degree: Master
Type: Thesis
Country: China
Candidate: L F Wei
GTID: 2439330602483562
Subject: Applied statistics
Abstract/Summary:
With the rapid development of Internet technology in recent years, Internet finance has gradually become a general trend. P2P (peer-to-peer) lending is an important part of Internet finance, and its growth cannot be ignored. Although P2P lending brings convenience, it also carries corresponding risks. Because China's regulations and policies have not yet matured, many illegal incidents and credit risk events have occurred. In response, the country's financial regulators have continued to issue relevant policies, and exit and transformation will become one of the main development trends for platforms. In the future, the ability to use financial technology will become an important indicator of a P2P platform's competitiveness. Under these conditions, P2P platforms need to focus on personal credit risk assessment in order to reduce the risk borne by P2P companies and lenders.

Based on this background and my internship experience, I found that most domestic practitioners use the logistic regression algorithm to construct personal credit risk assessment models, and few use the extremely randomized trees algorithm. This thesis compares the logistic regression algorithm and the extremely randomized trees algorithm for constructing a personal credit risk assessment model. The research proceeds as follows.

First, by reviewing the development of P2P lending at home and abroad, and by analyzing evaluation indicators and existing research on personal credit risk assessment models at home and abroad, I set out the content and purpose of the thesis and its practical significance. The thesis then introduces the theory behind the logistic regression algorithm and the extremely randomized trees algorithm.

Secondly, the personal credit risk assessment model is built through the steps of data acquisition, data cleaning, feature selection, and model building. The Borderline-SMOTE algorithm is used to balance the collected data. Models are then constructed with the logistic regression algorithm and the extremely randomized trees algorithm respectively, and the two are compared. Because the data were collected during my internship at a domestic Internet technology company, the research has practical significance.

Finally, after training and parameter tuning, the models meet the desired requirements, and the advantages and disadvantages of the two models are discussed. The results show that the model based on the extremely randomized trees algorithm outperforms the model based on logistic regression. In the process of constructing the two models, I also found that users' loan-related features have a large influence on the predictive performance of the model. Therefore, when collecting user data, priority should be given to loan-related features, with other features collected as a complement.

This thesis establishes a personal credit risk assessment model based on data collected from an Internet financial technology company, and it has high application value in practice. Predicting a customer's risk of default in advance can help P2P platforms and lenders reduce risk and improve the competitiveness of P2P companies. It can also promote the development of domestic P2P lending.
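The abstract names the extremely randomized trees algorithm without describing how it differs from an ordinary decision tree. The following is a minimal sketch of its node-splitting rule as it is commonly described: each candidate feature receives a cut-point drawn uniformly at random from its observed range, and the best of these random candidates is kept. It is not the thesis author's implementation. The function names, the Gini criterion, and the toy borrower features (loan amount, age) are all assumptions made for illustration.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels (1 = default, assumed)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def random_split(rows, labels, n_candidates, rng):
    """Choose a split using random cut-points and keep the best one.

    rows: list of feature lists; labels: list of 0/1 values.
    Returns (feature_index, threshold, weighted_impurity), or None
    if no candidate separates the samples.
    """
    n_features = len(rows[0])
    features = rng.sample(range(n_features), min(n_candidates, n_features))
    best = None
    for j in features:
        values = [r[j] for r in rows]
        lo, hi = min(values), max(values)
        if lo == hi:
            continue  # constant feature in this node, cannot split
        threshold = rng.uniform(lo, hi)  # drawn at random, not optimised
        left = [y for r, y in zip(rows, labels) if r[j] < threshold]
        right = [y for r, y in zip(rows, labels) if r[j] >= threshold]
        if not left or not right:
            continue  # degenerate cut, one side is empty
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[2]:
            best = (j, threshold, score)
    return best


# Hypothetical toy data: [loan_amount, age] per borrower.
rows = [[1.0, 30.0], [2.0, 45.0], [8.0, 33.0], [9.0, 50.0]]
labels = [0, 0, 1, 1]
split = random_split(rows, labels, n_candidates=2, rng=random.Random(0))
print(split)
```

A full extremely randomized trees model would apply this rule recursively to grow each tree and average the predictions of many such trees. Drawing cut-points at random rather than searching for the optimal one is what distinguishes the method from a standard random forest.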
Keywords/Search Tags: P2P, Extremely Randomized Trees, Credit Risk Assessment, Logistic Regression, Ensemble Learning