| As China's power grid expands, the data structures within it have become increasingly complex. This growth brings significant advantages, but it also introduces risks: local faults such as short circuits and generator tripping can escalate into larger accidents if they are not handled in time. It is therefore very important to predict and classify disturbances online, accurately and promptly. With the application and development of computer technology in China's intelligent distribution network, the distributed phasor measurement device (D-PMU), which is cost-effective and well suited to fault location, has been widely deployed. The addition of a large number of D-PMU devices with high-frequency data streams has caused measurement data in the distribution system to grow explosively. This places ever higher demands on big data processing, so current data processing techniques must be optimized and extended to improve the timeliness of the data. Given this background and the requirements of the national key research and development program "Research and application of wide area measurement and control technology in distribution network", it is necessary to analyze massive D-PMU stream data efficiently and to classify and predict possible disturbances accurately. The main work of this paper is as follows. (1) Based on Spark Streaming, the stream computing framework of the Spark distributed computing engine, this paper designs and implements a pre-parsing compression method for D-PMU stream data arriving over multiple sockets, which improves parsing efficiency and reduces delay, and designs a distributed architecture that provides efficient and safe disaster recovery, data storage, and backup. (2) Based on the multi-dimensional characteristics of D-PMU stream data, this paper proposes a feature extraction method for D-PMU time series based on the PCA (principal component analysis) algorithm, which effectively removes a large number of redundant features from the time series and reduces the computational cost of the classification algorithm. (3) Because disturbances or equipment failures may occur in the power grid, this paper proposes a method based on the XGBoost algorithm to predict and classify the current disturbance in the grid. The method predicts the current state of the D-PMU substation system accurately and promptly, and provides a basis for determining reasonable maintenance intervals for equipment and lines. Simulation test results show that the Spark-based pre-parsing compression method reduces the delay to about 200 ms and improves parsing efficiency; the PCA-based feature extraction method reduces the computational cost of the classification algorithm; and the XGBoost-based power system disturbance classification method reaches an accuracy of up to 97.23% while effectively reducing classification time, enabling efficient real-time disturbance classification. |
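
The abstract gives no implementation details, so the following is only a minimal sketch of the idea behind contribution (2): PCA can collapse redundant measurement channels into a few components before classification. It is not the author's implementation. The three synthetic channels, the function names, and the use of power iteration to find only the leading component (rather than a full eigendecomposition) are all assumptions made for illustration.

```python
import math


def center(rows):
    """Subtract the per-feature mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(cov, iters=200):
    """Leading eigenvector and eigenvalue of cov via power iteration."""
    d = len(cov)
    # Deterministic, non-uniform start vector (an arbitrary choice).
    v = [1.0 / (i + 1) for i in range(d)]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("covariance matrix is zero; no direction to extract")
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the eigenvalue for the unit vector v.
    cv = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(v[i] * cv[i] for i in range(d))
    return v, eigenvalue


def project(centered, v):
    """Score of each centered row along the direction v."""
    return [sum(r[j] * v[j] for j in range(len(v))) for r in centered]


# Hypothetical stand-in for D-PMU samples: channel 2 is an exact multiple of
# channel 1 (a redundant feature), channel 3 is a small unrelated ripple.
samples = [
    [math.sin(0.1 * t), 2.0 * math.sin(0.1 * t), 0.01 * math.cos(0.37 * t)]
    for t in range(200)
]

centered = center(samples)
cov = covariance(centered)
direction, eigenvalue = top_component(cov)
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
scores = project(centered, direction)  # one feature per sample instead of three

print("leading direction:", [round(x, 3) for x in direction])
print("explained variance ratio:", round(explained, 5))
```

In this toy data the first component carries almost all of the variance, so a downstream classifier could work with one score per sample instead of three raw channels. A real pipeline would more likely use a library PCA that returns several components and would choose their number from the explained variance.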