| Credit fraud seriously disrupts the normal operation of credit businesses, and malicious fraud causes serious economic losses for financial institutions. Fraud detection is therefore an important part of credit risk management. Customer risk has two main sources: first, the customer’s own attribute characteristics are abnormal; second, because of the peer effect (Chen Qiang, 2010), the customer is affected by risky customers in the associated network. A customer’s own attributes include demographic information, assets and liabilities, behavioral preferences, and credit history. The associated network mainly contains relational information, such as customers, contacts, guarantors, and transactions. In recent years, academia and industry have mainly used supervised classification algorithms, such as logistic regression, decision trees, and support vector machines, to build fraud detection models. These classification models mainly use customers’ attribute characteristics and do not use information from the associated network. Consequently, they cannot capture the peer effect between users or identify the risks hidden in the associated network. To solve this problem, this thesis establishes a blockwise network autoregressive model and a fraud graph attention network model to analyze the interaction between feature variables and target variables in the loan applicant’s associated network. Inspired by the aggregation and risk-similarity characteristics of fraud gangs, this thesis proposes a blockwise network autoregressive (BWNAR) model to quantify the impact of network connectivity on risky customers. The model assumes that the entire individual network is composed of non-overlapping blocks, which captures the aggregation characteristics of fraud groups. The nodes in each block share the same network regression coefficient (called the network influence factor in this thesis). This coefficient reflects the block’s capacity to transmit risk, thus enabling the identification 
of fraud groups. To reduce the complexity of model estimation, this thesis adopts a two-step method. First, a pseudo-likelihood ratio (pseudo-LR) binary segmentation method based on spectral clustering is used to estimate the number of blocks in the network and their members. Then, without imposing any distributional assumption, quasi-maximum likelihood estimation (QMLE) is used to estimate each block’s network regression coefficient, the regression coefficients of the independent variables, and the asymptotic variance; the consistency and asymptotic normality of the parameter estimators are rigorously proved. In addition, this thesis proposes likelihood ratio test statistics for significance testing and multiple testing of the parameters, and rigorously establishes the asymptotic properties of the tests. The performance and applicability of the model are verified by simulation experiments and by experiments on real credit fraud data, respectively. The BWNAR model focuses on the overall influence factor of each network block. To further analyze how network influence factors differ between nodes in the same block, and to observe how risk spreads among nodes, this thesis proposes a fraud graph attention network (Fraud GAT) model for fraud detection scenarios. The model applies weight of evidence (WOE) segmentation and self-attention learning to node features, and GAT breadth attention learning and Bi-LSTM depth attention learning to edges. The resulting node attention vector and edge attention vector are then concatenated, and an activation function is applied for classification prediction. WOE segmentation of node features gives the feature variables a clearer business interpretation; self-attention learning on node features extracts important information and filters out useless information; and GAT breadth learning and Bi-LSTM depth learning on edges can learn the important risk propagation 
path between nodes in the block network. Compared with the logistic regression, BWNAR, GAT, and GeniePath models, the Fraud GAT model achieves the best performance, which indicates that the network influence factors between nodes can significantly improve fraud detection. Finally, we compare the prediction results of the BWNAR and Fraud GAT models on real data. The results of the two models are consistent: sub-networks with tight connections and no deep paths have smaller influence factors and lower risk, while sub-networks with sparse connections and deep paths have larger influence factors and higher risk. This is broadly consistent with real-world logic. Nodes in a tightly connected network know each other, so they have little influence on one another and small influence factors; however, they provide credit enhancement, so the risk is low. In a sparsely connected network, relationships spread only within a small range and the influence between nodes plays a greater role; the influence factor is large, but there is no credit enhancement, so the risk is high. In summary, this thesis proposes the BWNAR and Fraud GAT models based on the network block structure, considering the overall and local influence factors of the network blocks, respectively. The two models address the limited ability of the standard spatial autoregressive model and graph neural network models to analyze network block structure. We also rigorously prove the asymptotic properties of the parameter estimators and parameter tests. The real-data analysis shows that the proposed models perform well in the credit fraud detection scenario and offer good business interpretability. |
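
The WOE segmentation step applied to node features can be illustrated with a minimal sketch. The abstract does not specify how bins are chosen or which sign convention is used, so everything below is an assumption made for illustration: the function name `woe_by_bin`, the fixed bin edges, the toy data, the additive smoothing constant, and the convention WOE = ln(share of non-fraud cases in the bin / share of fraud cases in the bin).

```python
import math
from bisect import bisect_right


def woe_by_bin(values, labels, edges, smoothing=0.5):
    """Return one WOE value per bin (hypothetical helper, not from the thesis).

    labels: 1 = fraud ("bad"), 0 = non-fraud ("good").
    edges:  sorted cut points; they define len(edges) + 1 bins.
    smoothing: assumed additive constant that avoids log(0) in empty cells.
    """
    n_bins = len(edges) + 1
    good = [0] * n_bins
    bad = [0] * n_bins
    for v, y in zip(values, labels):
        b = bisect_right(edges, v)  # bin index = number of edges <= v
        if y == 1:
            bad[b] += 1
        else:
            good[b] += 1
    total_good = sum(good) + smoothing * n_bins
    total_bad = sum(bad) + smoothing * n_bins
    return [
        math.log(((good[b] + smoothing) / total_good)
                 / ((bad[b] + smoothing) / total_bad))
        for b in range(n_bins)
    ]


# Hypothetical toy data: a debt-to-income ratio and a fraud flag per applicant.
ratios = [0.1, 0.2, 0.25, 0.4, 0.5, 0.6, 0.8, 0.9]
flags = [0, 0, 0, 0, 1, 0, 1, 1]
woe = woe_by_bin(ratios, flags, edges=[0.3, 0.7])
```

Under this convention, a positive WOE marks a bin where non-fraud cases are over-represented and a negative WOE marks a bin where fraud cases are over-represented. Replacing each raw feature value with the WOE of its bin yields a monotone, directly readable risk encoding, which is one plausible reading of the business interpretability the abstract attributes to this step.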