High Dimensional Data Stream Anomaly Monitoring Model Based On WAMCUSUM

Posted on: 2022-10-21
Degree: Master
Type: Thesis
Country: China
Candidate: X J Liu
GTID: 2480306545986289
Subject: Mathematics
Abstract/Summary:
As production scales expand, the production process generates ever more data of ever higher dimension. The purpose of statistical process control is to use the quality characteristics of the out-of-control state to identify possible abnormal conditions. For high-dimensional data streams, however, irrelevant information in the original data weakens traditional process control charts, which then fail to identify anomalies in time. To improve monitoring efficiency, this paper improves the WAMCUSUM control chart and proposes a high-dimensional data stream anomaly monitoring model based on it. The core idea is to combine the WAMCUSUM chart with principal component analysis (PCA) and support vector machines (SVM), reducing the dimension while retaining as much effective information as possible.

Three high-dimensional data stream anomaly monitoring models based on the WAMCUSUM chart are proposed. The first is the WAMCUSUM chart model based on principal component analysis, called the PCA-WAMCUSUM chart for short. The second, the PCA mix-WAPMCUSUM chart, is obtained by improving the PCA-WAPMCUSUM chart. The third is the WAMCUSUM chart model based on a support vector machine, called the SVM-WAMCUSUM chart for short. In building each model, principal component analysis or the support vector machine reduces the dimension of the data; the control limit and the WAMCUSUM chart statistics are then calculated from the reduced data set, and the control chart is drawn to monitor the process.

In the simulation study, model performance is evaluated by the average run length (ARL). First, a simple high-dimensional data stream is simulated, and the two models are compared with an existing control chart based on the Bayesian method. The results show that the SVM-WAMCUSUM chart sends alarm signals faster and monitors better. Second, to analyze model performance further, high-dimensional data streams with different quality characteristics are generated. Compared with the two existing charts based on PCA mix, the PCA mix-WAPMCUSUM chart has a smaller ARL in the out-of-control state and better monitoring performance. Finally, semiconductor production process data are used in an empirical analysis that compares the average run length of different control charts under out-of-control observations. Compared with the WAMCUSUM chart, the PCA-WAMCUSUM and SVM-WAMCUSUM charts accumulate shift information faster, raise alarms earlier, and monitor better.

In addition, the SVM-WAMCUSUM chart needs the category information of the original data in advance, whereas the PCA-WAMCUSUM and PCA mix-WAPMCUSUM charts can monitor the original data directly without it. In summary, the PCA-WAMCUSUM, PCA mix-WAPMCUSUM, and SVM-WAMCUSUM charts monitor abnormal shifts in high-dimensional data streams better and raise the out-of-control alarm quickly when a fault occurs, reducing wasted resources in actual production.
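The pipeline described in the abstract (reduce the dimension, compute chart statistics and a control limit on the reduced data, then judge performance by ARL) can be illustrated with a rough sketch. The abstract does not define the WAMCUSUM statistic, so this sketch substitutes a standard two-sided univariate CUSUM applied to the standardized first principal component score; it is not the author's chart. The function names, the one-factor data generator, the reference value `k = 0.5`, and the control limit `h = 4.0` are all assumptions made for illustration.

```python
import math
import random


def first_pc(data, iters=50):
    """Return (column means, unit first principal direction) by power iteration."""
    n, p = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - mean[j] for j in range(p)] for row in data]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        # w = X^T (X v): proportional to the sample covariance applied to v
        proj = [sum(c[j] * v[j] for j in range(p)) for c in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def make_scorer(reference):
    """Build a function mapping an observation to its standardized PC1 score."""
    mean, v = first_pc(reference)
    p = len(v)
    raw = [sum((row[j] - mean[j]) * v[j] for j in range(p)) for row in reference]
    sd = math.sqrt(sum(s * s for s in raw) / (len(raw) - 1))
    return lambda x: sum((x[j] - mean[j]) * v[j] for j in range(p)) / sd


def run_length(stream, score, k=0.5, h=4.0):
    """Observations consumed until a two-sided CUSUM on the scores signals.

    If the stream ends without an alarm, the stream length is returned.
    """
    upper = lower = 0.0
    t = 0
    for t, x in enumerate(stream, start=1):
        z = score(x)
        upper = max(0.0, upper + z - k)
        lower = max(0.0, lower - z - k)
        if upper > h or lower > h:
            break
    return t


def simulate(rng, loading, shift, length, noise=0.5):
    """Hypothetical one-factor stream: a shared latent factor plus independent noise."""
    for _ in range(length):
        factor = rng.gauss(0.0, 1.0) + shift
        yield [a * factor + rng.gauss(0.0, noise) for a in loading]


def estimate_arl(score, rng, loading, shift, reps=100, max_len=2000):
    """Monte Carlo estimate of the average run length for a given mean shift."""
    lengths = [run_length(simulate(rng, loading, shift, max_len), score)
               for _ in range(reps)]
    return sum(lengths) / reps


rng = random.Random(0)
loading = [1.0] * 10                      # assumed 10-dimensional stream
reference = list(simulate(rng, loading, shift=0.0, length=500))
score = make_scorer(reference)            # fitted on in-control data only
arl_in_control = estimate_arl(score, rng, loading, shift=0.0)
arl_shifted = estimate_arl(score, rng, loading, shift=2.0)
print(f"in-control ARL ~ {arl_in_control:.1f}, shifted ARL ~ {arl_shifted:.1f}")
```

A useful chart has a large in-control ARL (few false alarms) and a small out-of-control ARL (fast detection), which is the comparison the abstract reports. The sketch keeps only one component for brevity; the number of components retained in the thesis is not stated in the abstract.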
Keywords/Search Tags: High dimensional data stream, PCA, SVM, PCA mix method, WAMCUSUM chart