| Time series data plays an increasingly important role in the era of big data. Earlier research on time series data in fields such as finance and meteorology has characterized the data in those fields relatively well, and statistical methods can analyze and process it effectively. In recent years, however, industrial time series data has attracted growing attention and has raised many new problems in time series processing. When dealing with data quality problems in high-dimensional time series, single-dimension processing methods are no longer applicable, because they cannot correctly and effectively analyze the relationships among the dimensions. Research on high-dimensional time series data in industry remains largely unexplored. Against this background, this paper focuses on high-dimensional time series data in industrial big data and carries out problem analysis, algorithm development, and system development. This paper identifies two problems in high-dimensional time series data: "staggered column detection and correction of high-dimensional time series data" and "value anomaly detection and correction of high-dimensional time series data". The main algorithm research and system development work is as follows: (1) For staggered column detection and correction, this paper proposes online and offline algorithms. With no prior knowledge, or only a small amount of it, the algorithms detect and correct the anomalies in industrial high-dimensional time series data in three steps: mode determination, anomaly detection, and comprehensive cleaning. They eliminate the effect of interference and small-range fluctuations and accurately locate the abnormal intervals. (2) For value anomaly detection and correction, this paper divides the problem into two parts: "detection and correction of single-dimensional time series value anomalies" and "multi-dimensional auxiliary detection and correction of high-dimensional time series value anomalies". In the single-dimensional part, this paper uses traditional statistical methods to detect abnormal points and an LSTM neural network to detect abnormal intervals. In the multi-dimensional auxiliary part, this paper proposes an algorithm whose running time grows linearly with the data volume. The algorithm detects and corrects anomalies in high-dimensional time series data in four steps: correlation analysis, confidence interval computation, abnormal item determination, and correction. (3) Based on the above algorithms, this paper develops Cleanits, a high-dimensional time series cleaning system for industrial big data, which provides functions including data cleaning, visualization of cleaning results, and measurement analysis. |
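
The four steps of the multi-dimensional auxiliary procedure in (2) can be illustrated with a minimal sketch. It is not the paper's algorithm: the function name `clean_with_auxiliary`, the choice of a single most-correlated auxiliary dimension, the least-squares line, and the median/MAD interval with threshold `z` are all assumptions made for illustration.

```python
import numpy as np


def clean_with_auxiliary(target, candidates, z=3.0):
    """Hypothetical sketch: clean one dimension using a correlated one.

    target:     1-D array, the dimension to clean.
    candidates: 2-D array (n_samples, n_dims), possible auxiliary dimensions.
    z:          assumed width of the interval, in robust standard deviations.
    """
    target = np.asarray(target, dtype=float)
    candidates = np.asarray(candidates, dtype=float)

    # Step 1 (correlation analysis): pick the most correlated dimension.
    corrs = np.array([
        np.corrcoef(target, candidates[:, j])[0, 1]
        for j in range(candidates.shape[1])
    ])
    best = int(np.argmax(np.abs(corrs)))
    aux = candidates[:, best]

    # Step 2 (confidence interval): fit target ~ slope * aux + intercept,
    # then build an interval around the residual median using the MAD.
    slope, intercept = np.polyfit(aux, target, 1)
    predicted = slope * aux + intercept
    residuals = target - predicted
    center = np.median(residuals)
    mad = np.median(np.abs(residuals - center))
    half_width = z * 1.4826 * mad  # 1.4826 scales MAD to a std. deviation

    # Step 3 (abnormal items): values whose residual leaves the interval.
    abnormal = np.abs(residuals - center) > half_width

    # Step 4 (correction): replace abnormal values with the fitted value.
    corrected = target.copy()
    corrected[abnormal] = predicted[abnormal] + center
    return corrected, abnormal, best
```

For a fixed number of dimensions, every step here is a single pass or a median over the samples, which is in keeping with the linear growth in running time that the abstract states. If the residuals are all identical, the MAD is zero and the interval collapses, so a real implementation would need a fallback for that case.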