| Data are an important force for extracting useful information, making scientific decisions, and driving the modernization of agriculture. China is a major agricultural country, and agriculture occupies a fundamental position in its national economy. A long agricultural history has allowed China to accumulate abundant data on every aspect of production, circulation, and consumption. In recent years, as agricultural modernization has accelerated and the rural revitalization strategy has been promoted, information technologies have poured into the agricultural sector and the volume of agricultural data has grown explosively. At present, however, agricultural informatization in China lags behind: the data cover a wide range, come from complex sources, are closely tied to time and space, and span long production cycles. Data quality problems are pervasive. Data sets contain not only many plainly abnormal values but also many values that look normal yet actually arise from entirely different mechanisms, so data analysis faces the dilemma of "rich data and poor information". In the era of big data, the generation of abnormal data cannot be prevented, and such data are difficult to eliminate by technical means. It is therefore of great significance to construct an abnormal-data validation model, mine the records that appear normal but are actually abnormal, uncover the hidden information behind them, and use it to make more scientific decisions. This paper constructs the Benford-SVR abnormal-data test model from two tools for testing anomalous data: Benford's law and support vector regression (SVR). The model is applied to a precipitation data set from the natural side of agriculture and a production data set from the social side, which enriches the theory and technical means of abnormal-data testing for agricultural data in China and points to future directions for abnormal-data test models. First, the paper starts from the research background and significance and surveys related methods for improving the efficiency and accuracy of anomaly testing. Two anomalous-data mining tools are selected, Benford's law and SVR. Benford's law states that the first digits 1 to 9 of a data set occur with probabilities given by a fixed logarithmic distribution, which provides the basic principle for screening out a pool of abnormal data. SVR has strong nonlinear mapping ability and considers the smoothness of the regression curve as a whole, so the fit is not pulled toward individual points with large regression errors, and such points can be mined as abnormal. Second, Benford's law screens abnormal data efficiently but over a wide range, whereas SVR mines abnormal data with high accuracy and robustness but is mainly suited to small samples. The two are therefore combined into the Benford-SVR abnormal-data model: Benford's law screens out the abnormal data pool, the high-quality data are used as the SVR training sample, the data in the abnormal data pool are used as the prediction sample, and abnormal data are mined from that pool. Third, the Benford-SVR model is used to analyze empirically 65 years of precipitation data for China and 4 years of production data for 7 cities in Hebei Province. The analysis concludes that the overall and local quality of the precipitation data is relatively high; that the data for Chuzhou, Handan, and Xingtai are of good quality; and that the data for Baoding, Shijiazhuang, Tangshan, and Zhangjiakou are suspect, with abnormally large data points mined from them. Finally, the empirical results show that the Benford-SVR model is an effective way to test data from both the natural-science and social-science sides of agriculture, here the precipitation data set and the production data, and that it mines outlier data points efficiently and accurately. However, because the model learns in an unsupervised manner, accidental errors can occur. Follow-up work should identify the abnormal data points and dig out the useful information behind them. Future development requires scholars in all fields to study the nature of the model further, combine it with other algorithms, improve and expand its applications, raise the efficiency and accuracy of outlier mining, improve the utilization of agricultural data, and promote the development of agriculture. |
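As an illustration of the first-digit screening idea, the following is a minimal sketch and not the paper's implementation. It computes the Benford probabilities P(d) = log10(1 + 1/d) and compares observed first-digit counts against them. The chi-square statistic, the 5% critical value, and the names `leading_digit`, `benford_chi_square`, and `is_suspect` are assumptions made for the example; the abstract does not say which goodness-of-fit measure the authors use.

```python
import math
from collections import Counter

# Benford's law: probability that the first significant digit equals d (d = 1..9).
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Assumed threshold: 5% critical value of chi-square with 8 degrees of freedom.
CHI2_CRIT_8DF_05 = 15.507


def leading_digit(x):
    """Return the first significant digit of a nonzero number."""
    if x == 0:
        raise ValueError("zero has no leading digit")
    # Scientific notation looks like 'd.ddd...e+xx', so the first character is the digit.
    return int(f"{abs(x):.15e}"[0])


def benford_chi_square(values):
    """Chi-square statistic of observed first-digit counts against Benford's law."""
    digits = [leading_digit(v) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    return sum((counts.get(d, 0) - n * p) ** 2 / (n * p) for d, p in BENFORD.items())


def is_suspect(values, critical=CHI2_CRIT_8DF_05):
    """Flag a data set whose first digits deviate strongly from Benford's law."""
    return benford_chi_square(values) > critical


# Illustrative data only: powers of two follow Benford's law closely,
# while the integers 100..999 have uniformly distributed first digits.
powers_of_two = [2 ** k for k in range(1, 101)]
uniform_like = list(range(100, 1000))

print(is_suspect(powers_of_two))  # False
print(is_suspect(uniform_like))   # True
```

A data set flagged this way would go into the abnormal data pool; unflagged data would be the candidate training sample for the regression stage.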
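The regression stage can be sketched in the same hedged spirit: fit SVR on data treated as high quality, predict the points in the abnormal data pool, and flag those whose residual is large. This assumes NumPy and scikit-learn are available. The synthetic sine data, the hyperparameters (`C`, `gamma`, `epsilon`), the residual threshold, and the helper name `flag_anomalies` are all illustrative assumptions, not values or names from the paper.

```python
import numpy as np
from sklearn.svm import SVR


def flag_anomalies(x_train, y_train, x_pool, y_pool, threshold):
    """Fit SVR on trusted data and return indices of pool points
    whose absolute residual exceeds the threshold."""
    # Hyperparameters are assumed for this sketch, not taken from the paper.
    model = SVR(kernel="rbf", C=10.0, gamma=1.0, epsilon=0.05)
    model.fit(np.asarray(x_train, dtype=float).reshape(-1, 1),
              np.asarray(y_train, dtype=float))
    predicted = model.predict(np.asarray(x_pool, dtype=float).reshape(-1, 1))
    residuals = np.abs(np.asarray(y_pool, dtype=float) - predicted)
    return [int(i) for i in np.flatnonzero(residuals > threshold)]


# Synthetic example: a smooth trend as the "high-quality" training sample.
x_train = np.linspace(0.0, 6.0, 60)
y_train = np.sin(x_train)

# "Abnormal data pool": four points on the trend, one with an injected error.
x_pool = np.array([0.5, 2.0, 3.5, 5.0])
y_pool = np.sin(x_pool)
y_pool[2] += 2.0  # injected anomaly at index 2

flagged = flag_anomalies(x_train, y_train, x_pool, y_pool, threshold=0.5)
print(flagged)
```

Because the regression is trained only on the trusted sample, the injected point does not influence the fitted curve, which is why its residual stands out.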