| In recent years, with the steady progress of power system reform, thermal power plants have been actively deepening reform and developing informatized, intelligent unit information systems to lay the foundation for smart power plants. China’s thermal power industry is currently operating at high speed and producing a steady stream of data, so identifying and repairing bad data has become a pressing issue. This paper therefore discusses how to identify and repair bad data of thermal power units quickly and accurately from the following three aspects. To address the poor universality, lack of reference for identification results, and low identification accuracy of existing identification methods for bad data of thermal power units, this paper proposes a bad-data identification method, HC-GSA. First, this paper optimizes the partitioning around medoids (center points) algorithm by using agglomerative hierarchical clustering and a model evaluation index that does not require known true labels, and proposes the HC-Center clustering algorithm for processing historical normal data. Second, this paper uses the gap statistic algorithm to compare the HC-Center clustering results of the data to be measured with those of the normal data, and identifies bad data based on the value of the within-class mean squared deviation. Finally, experiments show that the proposed method effectively identifies bad data of thermal power units and achieves higher identification accuracy than algorithms of the same type. To address the limitation that existing bad-data repair methods for thermal power units cannot accurately repair continuous missing data or large amounts of bad data, this paper proposes a bad-data repair method based on a similar state method and the SA-PSO-RBF repair algorithm. First, to ensure that the
repaired data approach the actual values as closely as possible, this paper proposes a similar state method that selects data sets similar to the state immediately before the bad data occurred, in order to build a training sample set for the subsequent neural network algorithm. Second, this paper proposes the SA-PSO-RBF repair algorithm, which uses a particle swarm optimization algorithm improved by simulated annealing to optimize the width and weight parameters of a radial basis function (RBF) neural network, and combines them with the center parameters computed by the partitioning around medoids (center points) algorithm to construct the neural network prediction model. The algorithm predicts the values at the positions of the bad data and replaces the identified bad data with predicted values close to the actual values, thereby repairing the bad data. Finally, experiments show that the proposed method repairs multiple discontinuous bad data points or continuous missing data more accurately, and has a clear advantage in prediction accuracy over algorithms of the same type. To address the insufficient computing resources of a single node, low computing efficiency, and long running time when the data to be measured or the training samples are too large, the HC-GSA bad-data identification method and the SA-PSO-RBF repair algorithm are parallelized. First, this paper establishes parallelization schemes for the HC-Center clustering algorithm and the gap statistic algorithm respectively, realizing a parallel design of the HC-GSA identification method. Second, this paper parallelizes the parts of the SA-PSO-RBF repair algorithm that consume the most resources and time. Finally, experiments verify that the parallel design of the HC-GSA identification method and the SA-PSO-RBF repair algorithm proposed
in this paper achieves good accuracy and high computational efficiency. |
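
The identification idea summarized above, comparing the data to be measured against clusters of historical normal data by means of a within-class mean squared deviation, can be illustrated with a minimal Python sketch. It is not the HC-GSA method itself. The cluster centers are assumed to be given (in the paper they would come from the HC-Center algorithm), the gap statistic comparison is replaced by a simple threshold, and the function names, the `factor` parameter, and the toy data are all hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_center(point, centers):
    """Return (index, squared distance) of the center closest to point."""
    dists = [sq_dist(point, c) for c in centers]
    idx = min(range(len(centers)), key=dists.__getitem__)
    return idx, dists[idx]


def cluster_msd(data, centers):
    """Within-cluster mean squared deviation of the normal data."""
    sums = [0.0] * len(centers)
    counts = [0] * len(centers)
    for p in data:
        idx, d = nearest_center(p, centers)
        sums[idx] += d
        counts[idx] += 1
    return [s / n if n else 0.0 for s, n in zip(sums, counts)]


def flag_bad(samples, centers, msd, factor=9.0):
    """Flag a sample whose squared deviation from its nearest center
    exceeds factor times that cluster's normal mean squared deviation.
    The threshold rule is an assumption standing in for the gap statistic."""
    flags = []
    for p in samples:
        idx, d = nearest_center(p, centers)
        flags.append(d > factor * msd[idx])
    return flags


# Toy historical "normal" data forming two operating states (hypothetical).
normal = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2),
          (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
centers = [(1.0, 1.0), (5.0, 5.0)]  # assumed to be precomputed

msd = cluster_msd(normal, centers)
samples = [(1.1, 0.9), (3.0, 3.0), (5.1, 5.0)]
flags = flag_bad(samples, centers, msd)
print(flags)
```

In this toy setup only the middle sample, which lies far from both operating states, is flagged. A cluster with zero dispersion would make any deviation count as bad, so a real implementation would need a floor on `msd` or a statistically grounded criterion such as the gap statistic the paper describes.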