
Financial Fraud Early Warning Based on a Hybrid Model

Posted on: 2020-03-20, Degree: Master, Type: Thesis
Country: China, Candidate: X Cai
GTID: 2439330590971025, Subject: Applied Statistics
Abstract/Summary:
With the continuous improvement of the capital market, China's economy has kept growing, bringing enterprises abundant development opportunities and platforms. Yet in this favorable economic climate, corporate financial fraud has occurred frequently. Frequent fraud undermines investor confidence and hinders the healthy and stable development of the capital market. China's capital market has now entered a critical period, the so-called "deep water area". To promote the healthy and stable development of the capital market and maintain market order, this paper constructs a comprehensive financial fraud early warning model to identify the likelihood of financial fraud in listed companies. Effectively curbing financial fraud and safeguarding information disclosure in the financial market is the central aim of this paper.

A review of the domestic and foreign literature shows that scholars have traditionally relied on the empirical discriminant method, that is, on their own professional knowledge and years of research experience, to judge whether a company has committed financial fraud. In the era of big data, however, auditors must analyze the financial data of a large number of listed companies, and traditional empirical methods cost a great deal of time and labor. As science and technology have developed, domestic and foreign scholars have proposed using machine learning to identify financial fraud: from a data mining perspective, different indicators and data are selected to build an effective early warning model. In China, however, the use of machine learning to identify financial fraud is still at an exploratory stage. Most scholars use models such as logistic regression, DNN and SVM, and the recognition performance obtained with different samples and different models is uneven.

The main cause lies in how fraud and non-fraud samples are traditionally divided: only enterprises that the CSRC, the Shanghai Stock Exchange and the Shenzhen Stock Exchange have announced as having financial fraud problems are classified as fraud samples, while companies that have not been named in such notices are classified as non-fraud samples. In fact, we cannot accurately judge whether a company that has not been criticized by the CSRC, the Shanghai Stock Exchange or the Shenzhen Stock Exchange is really free of financial fraud. If such companies are fed directly into the model as non-fraud samples, the model will suffer a certain degree of recognition bias. Some scholars have therefore proposed using empirical judgment to select, from the companies that have not been publicly criticized, those in sound financial condition as non-fraud samples. This practice still has a drawback: the identifying information of companies that are in poor financial condition but have not committed fraud cannot be brought into the model.

By reading a large amount of literature, I found that the PU Learning strategy can effectively handle data sets that consist of positive samples and unlabeled samples. This paper therefore builds its financial fraud early warning model mainly on PU Learning, as follows. First, in collecting samples, companies that were criticized by the CSRC, the Shanghai Stock Exchange and the Shenzhen Stock Exchange are recorded as positive samples P, and companies that were not publicly criticized are recorded as unlabeled samples U. Second, in selecting indicators, the paper draws mainly on the financial fraud identification literature of scholars in China and selects a set of financial indicators of the listed company for the current year. Because fraudulent enterprises may show abnormal changes in the financial indicators of the previous two years, the paper introduces the financial indicators of the previous two years as well as those of the current year. In addition to financial indicators, it introduces non-financial indicators such as the audit opinions on listed companies' financial reports and the attention of securities firms. Third, in building the model, the paper first trains a semi-supervised model on the P+U data set in a two-step stage in order to screen out financially reliable samples. Because the financially reliable samples obtained in the two-step stage and the fraud samples are imbalanced, before the direct stage the paper draws, with replacement, 30 sets of financially reliable samples as the opposing (negative) samples; each set contains the same number of listed companies as the fraud samples, taken from the same year and the same industry. Logistic regression, GBDT and XGBoost models are then fitted to each of the 30 data sets in the direct stage, and an ensemble method combines the 90 trained models into the final hybrid early warning model.

The main innovations of this paper are as follows: (1) In dividing the samples, it departs from the traditional approach by splitting the data into positive and unlabeled samples and identifying financial fraud with the PU Learning strategy; in handling the imbalanced data set, it combines repeated sampling with downsampling to extract 30 data sets. (2) In addition to the current-year financial indicators that have traditionally been used, the paper introduces some financial indicators of the previous two years and non-financial indicators of the current year. The final result is that the AUC of the fraud early warning model based on the PU Learning strategy is higher than 0.88, which indicates that the model constructed in this paper is effective. Ranking the fraud identification factors by importance shows that "equity concentration", "previous year's total net profit margin" and "analyst attention" can serve as important factors for identifying whether a company has committed financial fraud.
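The pipeline described in the abstract (screen reliable negatives out of U, then average models trained on repeated balanced draws) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the thesis's implementation: the abstract does not say how the two-step stage selects reliable samples, so the sketch uses the common "treat U as negative, keep the lowest-scoring share" heuristic; a toy centroid-distance scorer stands in for Logistic regression, GBDT and XGBoost; the data are synthetic; the draws are not matched by year or industry; and the names `fit_scorer`, `reliable_negatives`, `balanced_ensemble` and `keep_fraction` are hypothetical.

```python
import math
import random


def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_scorer(pos, neg):
    """Toy stand-in base learner (assumption, not the thesis's models).

    Returns a function mapping a feature vector to a score in (0, 1);
    points closer to the positive centroid score nearer 1.
    """
    cp, cn = centroid(pos), centroid(neg)

    def score(x):
        return 1.0 / (1.0 + math.exp(dist(x, cp) - dist(x, cn)))

    return score


def reliable_negatives(P, U, keep_fraction=0.5):
    """Step 1 (assumed heuristic): treat U as negative, score every
    unlabeled sample, and keep the lowest-scoring share as reliable."""
    score = fit_scorer(P, U)
    ranked = sorted(U, key=score)
    return ranked[: max(1, int(len(ranked) * keep_fraction))]


def balanced_ensemble(P, RN, n_sets=30, seed=0):
    """Step 2: draw n_sets balanced negative sets from RN (each set is a
    subsample of RN, and sets may overlap), fit one scorer per set, and
    average the scores of all scorers."""
    rng = random.Random(seed)
    k = min(len(P), len(RN))
    scorers = [fit_scorer(P, rng.sample(RN, k)) for _ in range(n_sets)]

    def predict(x):
        return sum(s(x) for s in scorers) / len(scorers)

    return predict


# Synthetic data: labeled fraud cases P, and an unlabeled pool U that
# mixes hidden fraud cases with clean companies.
rng = random.Random(42)
P = [[rng.gauss(2, 0.5), rng.gauss(2, 0.5)] for _ in range(20)]
hidden_fraud = [[rng.gauss(2, 0.5), rng.gauss(2, 0.5)] for _ in range(10)]
clean = [[rng.gauss(-2, 0.5), rng.gauss(-2, 0.5)] for _ in range(100)]
U = hidden_fraud + clean

RN = reliable_negatives(P, U, keep_fraction=0.5)
model = balanced_ensemble(P, RN, n_sets=30)
print(model([2.0, 2.0]) > 0.5, model([-2.0, -2.0]) < 0.5)
```

To move closer to the setup the abstract describes, `fit_scorer` could be replaced by three learner families fitted on each of the 30 sets, giving 90 models whose outputs are combined, and the draw inside `balanced_ensemble` could be restricted to companies from the same year and industry as the fraud samples.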
Keywords/Search Tags:Financial Fraud Warning, PU Learning, Ensemble Learning