The content of heavy metals in soil directly affects public safety and the management decisions of the relevant departments. Heavy metal data obtained from soil surveys often differ significantly from the data in their neighborhood, owing either to data errors or to genuinely anomalous concentrations. It is therefore necessary to identify anomalies in soil heavy metal data to guide subsequent processing and to eliminate or weaken the impact of abnormal data. Current research on outlier detection mainly considers attribute values, time and space. Traditional outlier detection methods based on temporal and spatial attributes detect outliers using those attributes alone. When identifying anomalies in multi-stage soil heavy metal data collected at non-fixed sampling points, both the temporal and the spatial scale must be considered. Existing spatiotemporal anomaly recognition, however, considers only spatial autocorrelation or only temporal autocorrelation, not spatiotemporal autocorrelation; the spatiotemporal neighborhood is determined largely subjectively, without regard to its stability for anomaly detection; and the influence of anomaly type on subsequent data processing cannot be distinguished. To address these problems, this study proposes a spatiotemporally integrated method for recognizing anomalies in soil heavy metal pollution data. The method uses the k-nearest distance to obtain a stable spatiotemporal neighborhood, uses a spatiotemporal Moran index based on spatiotemporal correlation to detect spatiotemporal outliers of soil heavy metals, and extracts strongly correlated indicators through correlation analysis to assist in recognizing the anomaly type. Based on this method, a spatiotemporally integrated anomaly recognition system for soil heavy metal pollution data was developed. Finally, taking multi-year data from a farmland area of Beijing as a case study, this method is used 
to identify the anomalies in the heavy metal data of this area, and the identification results are compared and analyzed in terms of variation characteristics and interpolation accuracy. The results showed a suspected data error anomaly at point 58 in 2005 for Cu, and anomalies at points 6, 25, 27 and 43 in 2005 for Cr. For the heavy metal As, suspected data error anomalies were found in samples No. 13, No. 33 and No. 88 in 2005 and No. 225 and No. 258 in 2007. The comparison of variation characteristics shows that, after removing the global, local and suspected data error outliers from the original sample points, the overall dispersion of the sample is reduced, and its spatial autocorrelation and regional structural variation trend are enhanced. After removing the global and local outliers, the spatial interpolation error of the data is significantly reduced. The interpolation error after also removing the suspected data error outliers is essentially the same as that after removing the local outliers, indicating that removing the suspected erroneous points from the original sample has little effect on the overall interpolation estimate, which demonstrates the effectiveness and accuracy of the method.
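
As a rough illustration of the detection step described above, the following Python sketch builds a k-nearest spatiotemporal neighborhood and computes a local Moran statistic over it. It is a minimal sketch under stated assumptions, not the study's implementation: the function names `st_knn` and `local_moran` and the `time_scale` parameter are hypothetical, time is simply rescaled and treated as a third coordinate, and the standard local Moran's I with row-standardized weights stands in for the spatiotemporal Moran index, whose exact form is not given here.

```python
import numpy as np

def st_knn(coords, times, k, time_scale=1.0):
    """Return, for every sample, the indices of its k nearest spatiotemporal neighbours.

    Assumption: time is multiplied by `time_scale` and appended to the spatial
    coordinates, so that one Euclidean distance covers both space and time.
    """
    coords = np.asarray(coords, dtype=float)
    t = np.asarray(times, dtype=float) * time_scale
    pts = np.column_stack([coords, t])
    # Full pairwise distance matrix; fine for small survey data sets.
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a sample is never its own neighbour
    return np.argsort(d, axis=1)[:, :k]

def local_moran(values, neighbours):
    """Local Moran's I with equal (row-standardized) weights on each neighbour."""
    values = np.asarray(values, dtype=float)
    z = values - values.mean()          # deviations from the global mean
    m2 = np.mean(z ** 2)                # second moment used for scaling
    lag = z[neighbours].mean(axis=1)    # mean deviation of each neighbourhood
    return z * lag / m2
```

Under these assumptions, a strongly negative value marks a sample whose concentration departs from its spatiotemporal neighbors in the opposite direction from the global mean, which makes it a candidate outlier. Whether such a sample reflects a data error or a genuine anomaly would still have to be judged separately, for example with the correlated indicators mentioned above.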