| Data quality is becoming increasingly important in the field of big data. Data is the carrier of information. When valuable information is mined and applied to a particular field, the quality of the data should be the first factor considered. If accuracy and authenticity cannot be guaranteed, the result can be missing data, data errors, and confused data logic, which affect not only our judgment of the information but also people's judgment of how things will develop in the future. Deviations in judgment can be expected to result in economic losses or mistakes in decision-making. Cleaning data and improving data quality are therefore of great significance for research and analysis that take data as their entry point. The topic of this article originates from the Chinese rural population poverty alleviation verification project in which our school has participated in recent years. I have followed the verification team many times and visited more than 100 poor households. A large amount of data has been collected for statistics and analysis. Data quality has always been one of the issues of greatest concern at our School's Applied Statistics Center. Benford's Rule has been used for data quality testing for several decades, but progress in applying it to data quality testing has been slow. This paper therefore proposes a new idea that combines Benford's Rule with the SVM classification algorithm to test data quality, giving the data preparation stage a new Benford-based method for testing the quality of data. The research work of this paper is mainly reflected in the following five aspects: 1. Reviewing research on data quality testing at home and abroad, the research status of data quality testing based on Benford's Rule, and the progress of the results obtained, and summarizing the latest research directions. 2. Starting from Benford's Rule, exploring its advantages and disadvantages. 3. Combining Benford's Rule with the goodness of fit test to address the limitations of Benford's Rule in practical applications. 4. Combining the Benford's Rule test results with the SVM classification algorithm to overcome the limitation that traditional Benford's Rule can only be applied to the first digit of the data. 5. Summarizing the shortcomings of Benford's Rule and the SVM classification algorithm, which provides a new idea for data quality assessment. The main result of the thesis is to draw on the advantages of both Benford's Rule and the SVM classification algorithm to compensate for Benford's Rule being limited to the first digit when locating abnormal data samples, and to improve the effectiveness of Benford's Rule in practical applications. |
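
As an illustration of the idea of pairing Benford's Rule with a goodness of fit test, the following is a minimal sketch and not the thesis's own implementation. It assumes a Pearson chi-square test on first digits, where the expected frequency of leading digit d is log10(1 + 1/d). The function names (`benford_expected`, `first_digit`, `benford_chi_square`) and the sample data are hypothetical, and the SVM classification step is not shown.

```python
import math
from collections import Counter

# Chi-square critical value for 8 degrees of freedom at the 5% level
# (nine digit classes minus one).
CRITICAL_5PCT_8DF = 15.507


def benford_expected():
    """Expected first-digit probabilities under Benford's Rule: P(d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Return the leading non-zero digit of a non-zero number."""
    # Scientific notation puts the leading digit first, e.g. '3.14...e+02'.
    return int(f"{abs(x):.15e}"[0])


def benford_chi_square(values):
    """Pearson chi-square statistic of observed first digits against Benford's Rule."""
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in benford_expected().items()
    )


if __name__ == "__main__":
    # Illustrative data only: powers of 2 are known to follow Benford's Rule closely,
    # while the integers 100..999 have uniformly distributed first digits.
    benford_like = [2 ** k for k in range(1, 101)]
    uniform_like = list(range(100, 1000))
    for name, data in [("powers of 2", benford_like), ("100..999", uniform_like)]:
        stat = benford_chi_square(data)
        verdict = "consistent with" if stat < CRITICAL_5PCT_8DF else "deviates from"
        print(f"{name}: chi-square = {stat:.2f}, {verdict} Benford's Rule at the 5% level")
```

A statistic above the critical value only signals that the data set as a whole deviates from the Benford distribution; it does not say which records are abnormal, which is the gap the abstract proposes to address with a classifier.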