Research On Intelligent Water Data Big Data Cleaning Algorithm Based On Stereo Sensing

Posted on: 2020-01-24
Degree: Master
Type: Thesis
Country: China
Candidate: Q X Meng
Full Text: PDF
GTID: 2392330623956273
Subject: Electronic Science and Technology
Abstract/Summary:
Data plays a very important role in human life. Water data is multivariate, has complicated data types and is strongly correlated, so it becomes difficult to handle once dirty data is collected, which poses a severe challenge to decision-making analysis in the water industry. Cleaning water industry data is therefore imperative. Using data mining technology for data cleaning is a frontier problem in the field of data mining. In recent years, with advances in machine learning and statistical learning, the application of data cleaning technology in various fields has made great progress, but cleaning methods for the spatial, temporal and numerical characteristics of water data still need improvement and breakthroughs. Based on the spatial and temporal characteristics of water data, this thesis takes spatial and temporal outlier detection algorithms as its entry point and then studies a smart water big data cleaning algorithm based on stereo perception. The main tasks are as follows:

(1) Research on an outlier algorithm based on spatial characteristics. According to the business attributes and object attributes of the spatial and multivariate characteristics of water data, this thesis proposes using KNN to find the neighboring points of each point, and then using the watershed as the comparison function under weight adjustment and the Mahalanobis distance, which is suitable for multivariate data, as the threshold function to detect outliers in the spatial feature attributes of water data. The experimental results verify the accuracy and effectiveness of the algorithm in detecting outliers in water data, and lay a foundation for further research on the smart water big data cleaning model based on stereo perception.

(2) Research on an outlier algorithm based on time characteristics. Aiming at the time series and multivariate characteristics of water data, this thesis proposes using the FCM clustering method and two fuzzy integral methods to reduce the dimensionality of the time series, and compares the three methods to find the dimensionality reduction model with generalization ability for water data. The time series data is then treated as the visible state sequence of an improved hidden Markov model, and the Viterbi algorithm is applied to predict the most likely hidden state sequence (normal or abnormal) for outlier detection. This effectively improves the accuracy of outlier detection and lays a foundation for further research on the smart water big data cleaning model based on stereo perception.

(3) Research on a smart water big data cleaning algorithm based on stereo perception. On the basis of the research on outlier detection for the spatial and temporal characteristics of water data, a general outlier detection method for numerical attributes using an improved two-step clustering algorithm and a vacancy value filling method using an improved multi-layer perceptron artificial neural network are applied, and a smart water big data cleaning model based on stereo perception is proposed. First, the water data to be cleaned is preprocessed. Then the general, spatial and time outlier detection algorithms are used to detect outliers according to the attribute characteristics of the data. After the outliers are detected, the vacancy values in the data are filled to obtain clean water data. The model is designed according to the characteristics of water data, the characteristics of dirty water data and the data cleaning process, and can effectively clean the dirty data appearing in the water sector. The experimental results show that outlier detection in the stereo-perception-based smart water big data cleaning model has good accuracy and effectiveness, and that vacancy value filling also has good accuracy and generalization ability, which indicates the feasibility and effectiveness of this method in data cleaning.
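The Viterbi step in task (2) can be illustrated with a minimal sketch. It is not the thesis's improved hidden Markov model, whose details the abstract does not give: it is the standard Viterbi algorithm on a two-state model (normal, abnormal). The observation symbols ("low", "mid", "high") and every probability below are hypothetical values chosen only for illustration.

```python
import math


def _log(p):
    """Natural log that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence for the observations."""
    if not observations:
        return []
    # best[t][s]: log-probability of the best path that ends in state s at time t.
    # Log space avoids numerical underflow on long sequences.
    best = [{s: _log(start_p[s]) + _log(emit_p[s][observations[0]]) for s in states}]
    # back[t][s]: the predecessor of state s on that best path.
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + _log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + _log(emit_p[s][observations[t]])
            back[t][s] = prev
    # Trace back from the best final state to recover the full path.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical model parameters (assumptions, not taken from the thesis).
STATES = ("normal", "abnormal")
START_P = {"normal": 0.95, "abnormal": 0.05}
TRANS_P = {
    "normal": {"normal": 0.9, "abnormal": 0.1},
    "abnormal": {"normal": 0.3, "abnormal": 0.7},
}
EMIT_P = {
    "normal": {"low": 0.1, "mid": 0.8, "high": 0.1},
    "abnormal": {"low": 0.45, "mid": 0.1, "high": 0.45},
}

# A discretized sensor series: a run of "high" readings between "mid" readings.
readings = ["mid", "high", "high", "high", "mid"]
print(viterbi(readings, STATES, START_P, TRANS_P, EMIT_P))
```

With these assumed parameters, the sustained run of "high" readings is decoded as abnormal while the surrounding "mid" readings are decoded as normal. In practice the transition and emission probabilities would be estimated from labeled or historical water data rather than set by hand.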
Keywords/Search Tags:data mining, data cleaning, spatiotemporal characteristics, outlier detection, vacancy filling