
Research On The Factors Influencing Successful Targets And Default Risk Of Borrower On P2P Platform

Posted on: 2021-04-08 | Degree: Master | Type: Thesis
Country: China | Candidate: Y D Ou
GTID: 2439330611962871 | Subject: Applied statistics
Abstract/Summary:
Because of their small scale and high operational risk, small and micro enterprises and individuals find it difficult to obtain loans from traditional financial institutions, which leaves them with financing difficulties. The Internet can effectively reduce the risk and cost of review, making operations more transparent and microcredit feasible. Against this background of rapid Internet development, P2P online lending has grown quickly in China's financial industry. In recent years, however, P2P online lending has seen a large number of illegal operations: illegal fundraising, senior executives absconding, and difficulties in withdrawing cash from platforms are not uncommon. Using big data analysis methods such as machine learning and neural networks, this paper therefore explored the factors that significantly influence borrowers' successful targets (that is, whether a borrower succeeds in borrowing), so as to help borrowers improve their success rate. It also explored how to improve the ability of lending platforms to identify borrower defaults, and selected a model that substantially improved the platforms' prediction of borrower default, which offers guidance for strengthening the platforms' early warning of defaults.

Chapter 1 introduced the research background and significance of P2P online lending, the research framework and organizational structure of this paper, and the state of research, in China and abroad, on P2P online lending and on the processing of imbalanced data.

Chapter 2 introduced the sources of the P2P online loan data, the handling of missing values, the detection of outliers and the transformation of related features. It also carried out an exploratory data analysis of two data sets: the data on factors influencing borrowers' successful targets and the data used by the P2P platform to identify borrower default. This preliminary analysis found that the four features Bidders, SuccessfulNum, TotalAmount and GuaranteeWays significantly influence successful targets. The five features AdvanceAmount, DefaultAmount, SeriousDefaultLoan, DefaultTimes and PrinAndInterest also clearly influence whether the borrower could borrow money successfully.

Chapter 3 performed feature selection using the variance selection method, the Spearman correlation coefficient method, tree-model-based feature selection and recursive feature elimination. CreditRating, RepaymentPeriod, Credits, APR, LoanNumbers, Age and BidAmount were selected for the data on factors influencing successful targets. CreditRating, APR, LoanNumbers, RepaymentPeriod, Credits, Income and Education were selected for the data identifying borrower default on the P2P platform.

Chapter 4 used Logistic regression, the CART decision tree and k-modes cluster analysis to model the data on borrowers' successful targets, in order to determine which factors significantly affect the probability of a successful target. First, Logistic regression showed that CreditRating, Credits, RepaymentPeriod, APR, LoanNumbers and BidAmount all significantly affect borrowers' success in borrowing. Second, classification with the CART decision tree showed that CreditRating and Credits had the greatest impact; when CreditRating was "AA and A" and Credits was greater than 3250, the probability of a successful target was higher. Finally, based on the exploratory data analysis of chapter 2 and the modeling analysis of chapter 4, the five features Bidders, SuccessfulNum, TotalAmount, GuaranteeWays and CreditRating were discretized and clustered with the k-modes algorithm, with the optimal number of clusters being 2. The results showed that the algorithm could accurately cluster borrowers into the two categories "successful targets" and "unsuccessful targets", with a final clustering accuracy as high as 99.6%.

Chapter 5 used NearMiss undersampling, Random Forest, the BP neural network and GBDT to model the data used by the P2P platform to identify borrower default. First, the NearMiss undersampling method was used to balance the classes. Second, Random Forest, the BP neural network and GBDT were used to predict borrower default. A comparison showed that GBDT, a Boosting-based ensemble learning method, gave the best classification results, with an accuracy of about 93% and an AUC value of up to 97%, so it could effectively predict borrower default. In addition, when any one of the four features PrinAndInterest, DefaultAmount, AdvanceAmount and SeriousDefaultLoan was added to the selected features under the optimal GBDT model, the classification accuracy was about 99%. The relative influence (rel.inf) of the added feature was more than 93, and as high as 99.79, which showed that these four features had a great influence on the model's classification results.

Chapter 6 summarized the main work and pointed out its shortcomings. It also put forward specific suggestions on how borrowers can improve their borrowing success rate and how P2P online lending platforms can improve their ability to predict borrower default. We hope to contribute to the sustainable and healthy development of the P2P lending industry.
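The class-balancing and evaluation steps described for chapter 5 can be illustrated with a minimal sketch. The abstract does not say which NearMiss variant or which software was used, so the code below assumes the NearMiss-1 rule (keep the majority-class samples whose mean distance to their k nearest minority-class samples is smallest) and computes AUC directly from its pairwise-ranking definition. The function names `nearmiss1` and `auc`, the toy feature values and the scores are all hypothetical and are not taken from the thesis.

```python
import math


def nearmiss1(majority, minority, n_keep, k=3):
    """Assumed NearMiss-1 rule: keep the n_keep majority points whose
    mean distance to their k nearest minority points is smallest."""
    def mean_knn_dist(point):
        dists = sorted(math.dist(point, m) for m in minority)
        nearest = dists[:k]
        return sum(nearest) / len(nearest)

    return sorted(majority, key=mean_knn_dist)[:n_keep]


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs in which the
    positive sample gets the higher score; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two scaled features per borrower.
non_default = [(0.1, 0.2), (0.2, 0.1), (0.8, 0.9),
               (0.9, 0.8), (0.15, 0.15), (0.5, 0.5)]
default = [(0.85, 0.85), (0.95, 0.9)]

# Undersample the majority class down to the size of the minority class.
balanced_majority = nearmiss1(non_default, default, n_keep=len(default), k=2)
print(balanced_majority)  # [(0.9, 0.8), (0.8, 0.9)]

# Hypothetical classifier scores for four borrowers (1 = default).
print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))  # 0.75
```

In this toy example the undersampling keeps the non-defaulting borrowers that lie closest to the defaulting ones, which is the behaviour that distinguishes NearMiss from random undersampling. Any classifier, such as the GBDT model named above, would then be trained on the balanced set and compared by accuracy and AUC.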
Keywords/Search Tags: P2P online lending, Logistic regression, Random Forest, BP neural network, GBDT