| Because enterprises are the main market entities, their financial information is important for investors' decision-making and for economic policy-making. Financial fraud, however, gives a misleading picture of economic performance, so identifying it has long been a classic research topic. Traditional studies tend to rely on financial information (such as financial ratios) and non-financial information (such as governance structure). Internet information (such as online analyses and reports of financial fraud) relates more directly to fraud, so using it should help detection. However, using internet information raises copyright issues and requires authorization; crawling is not a best practice; and trading internet information directly is not economically feasible, because information leaks easily and loses value as more people come to know it. We therefore use Machine Learning with Privacy Preservation to address these problems. Under this technology, the internet platform computes model parameters from its own internet information on demand, so only the model parameters, not the original information, are published. This avoids the copyright, technical, and economic problems above and helps machine learning be applied in practice to financial fraud detection. In this paper, we gather financial and non-financial information for 16,112 samples from 2012 to 2020 and construct internet information features. For experimental purposes, the internet information was obtained through search engines and web crawlers. Three models were constructed: Model 1 uses only financial and non-financial information; Model 2 uses all information with traditional machine learning methods; and Model 3 uses all information with Machine Learning with Privacy Preservation (Hetero SecureBoost and Hetero Neural Network). The results show that, compared with Model 1, Model 2 improves by 7 to 10 percentage points across all models, indicating that introducing internet information helps improve data quality. The consistency between the results of Model 2 and Model 3 verifies that, in financial fraud detection, Machine Learning with Privacy Preservation can make full use of internet information to build a better model while complying with laws and regulations. This paper discusses the necessity of using internet information and the barriers to obtaining and trading it, and explains the feasibility of Machine Learning with Privacy Preservation for detecting financial fraud. The experiments demonstrate both this necessity and this feasibility, which gives the work theoretical and practical significance. |