Research On Outlier Detection Approach For Agricultural Data Processing

Posted on: 2017-03-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B Q Yin
GTID: 1223330512950450
Subject: Agricultural information technology
Abstract/Summary:
With the rapid development of information technology, especially the progress of the Internet of Things, a large amount of data has been accumulated in agricultural production. Dirty data accompanies this tremendous growth in data volume and degrades data quality. Outlier detection is an effective way to ensure the accuracy and reliability of sensor data, and it is also used to discover and provide useful information. Motivated by the needs of agricultural data processing, this dissertation studies outlier detection approaches for agricultural sensor data and for near infrared spectral data. The main contents are as follows.

(1) Outlier detection approaches for sensor data are studied. Sensor data is generated as a stream that arrives instantly and continuously, and there is a certain spatio-temporal correlation among sensor readings. Based on these characteristics, outlier detection approaches based on a forecast model for single-sensor data were studied first. We then focused on multi-sensor data and proposed an outlier detection framework with two steps: online outlier detection, followed by offline identification of outlier sources. To identify the outlier sources, a classification method was presented, and two algorithms were proposed: one for outlier detection based on neighbor difference and clustering, and one for outlier source identification based on correlation. A real dataset was used to evaluate the proposed approach. The experiments show that the approach performs well at detecting outliers and identifying their sources; the accuracy of outlier source identification is 95.8%. The time complexity of the new algorithm is similar to that of the traditional algorithm.

(2) Outlier detection methods for near infrared (NIR) spectroscopy analysis are studied. NIR spectroscopy is widely used in agriculture because it is rapid and non-destructive, among other characteristics. However, it also has disadvantages, such as a low signal-to-noise ratio and a low effective information rate, which strongly affect the performance of the prediction model in NIR spectroscopy analysis. Detecting and eliminating outliers is therefore an important step in NIR spectroscopy analysis. On the basis of theoretical deduction, a new outlier detection algorithm based on joint XY distances (ODXY) was presented. The experiments show that the ODXY method has better performance and better generalization ability than the other approaches tested in our experiments. Based on the superimposition of NIR spectra and the ODXY approach, a dedicated outlier detection algorithm for NIR multicomponent analysis, termed MODXY, is proposed and proved. The experiments show that, in most cases, the MODXY method has better outlier sample recognition capability in NIR multicomponent analysis. Both the ODXY and MODXY methods have their own suitable range: they are more effective when the relative standard deviation of the components is large enough.
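As a rough illustration of the forecast-based idea for a single sensor stream mentioned in (1), the sketch below flags a reading when it deviates too far from a forecast built from recent readings. The abstract does not specify the forecast model or the threshold, so everything here is an assumption made for illustration: the moving-average forecast, the `k`-sigma rule, and the names `detect_stream_outliers`, `window`, `k` and `min_sigma` are hypothetical and are not taken from the dissertation.

```python
from collections import deque
from statistics import mean, stdev


def detect_stream_outliers(readings, window=5, k=3.0, min_sigma=1e-6):
    """Flag readings that deviate strongly from a moving-average forecast.

    Assumption (not from the dissertation): the forecast is the mean of the
    last `window` accepted readings, and a reading is an outlier when its
    residual exceeds `k` sample standard deviations of that window.
    """
    history = deque(maxlen=window)  # most recent accepted readings
    flags = []
    for value in readings:
        if len(history) < window:
            # Warm-up phase: too little history to forecast, accept as-is.
            history.append(value)
            flags.append(False)
            continue
        forecast = mean(history)
        # Guard against a zero spread when the window is constant.
        sigma = max(stdev(history), min_sigma)
        is_outlier = abs(value - forecast) > k * sigma
        flags.append(is_outlier)
        if not is_outlier:
            # Only accepted readings update the window, so a single spike
            # does not distort later forecasts.
            history.append(value)
    return flags


if __name__ == "__main__":
    temperatures = [20.0, 20.1, 19.9, 20.2, 20.0, 35.0, 20.1]
    print(detect_stream_outliers(temperatures))
```

Each reading is processed once as it arrives, which fits the streaming setting described above. One consequence of excluding flagged readings from the window is that a genuine, lasting shift in the signal would keep being flagged; a practical detector would need some rule for re-admitting such readings.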
Keywords: Outlier Detection, Data Stream, Sensor, Near Infrared Spectroscopy, Data Processing