| With advances in science and technology and the reform of the electricity system, demand for electricity has risen steadily. The growth of the Internet industry and broader social and economic development have further driven sustained growth in electricity consumption. In recent years, large-scale electricity theft has occurred repeatedly, and it has evolved from initially direct and crude methods into large-scale, high-tech theft that relies on intelligent equipment, specialized techniques, and concealed behavior. Electricity theft not only causes heavy economic losses for electricity companies and legitimate users, but also damages the grid to varying degrees. In the construction of smart grids, detecting potential anomalous data promptly and accurately and identifying electricity theft effectively are key to ensuring the security and reliability of the grid. Against this background, the work in this paper focuses on the following aspects: 1. This paper first summarizes the current data flow of smart grids and the phenomenon of electricity theft, and reviews recent research progress on electricity theft detection at home and abroad. Based on a comparison of existing methods and on a real data set of abnormal grid electricity consumption, all electricity users are divided into two types, long-span users and short-span users, and a data preprocessing and feature extraction process is designed to generate a feature set that correlates strongly with the category label. 2. For detection objects with long-term electricity consumption records, this paper uses a hybrid algorithm based on XGBoost, Random Forest, and Logistic Regression. Using features of each user’s electricity consumption behavior over the past two years, it combines the predictions of the three classification models to give a final judgment on whether the user’s consumption behavior is normal. The motivation for the hybrid model is that different sub-models differ in their sensitivity to, and their ability to handle, different data; combining the decisions of the sub-models yields a more comprehensive overall model. Experimental verification and comparison with recent similar studies show that the hybrid detection model performs well on every evaluation index and is well suited to screening such users for abnormal behavior. 3. For detection objects that have only short-span electricity consumption records (users whose accounts were opened recently), features are generated from their consumption records for the most recent month. Starting from the anomaly partition produced by One-Class SVM, the density measure of Local Outlier Factor and the average consumption of normal users predicted by an LSTM are used to correct normal users who were misjudged in the One-Class SVM results. Research on anomaly detection for short-span electricity data is still relatively scarce, and the scheme designed in this paper offers a degree of novelty and soundness. In tests on real data, this strategy retains the high recall of One-Class SVM while significantly improving the accuracy of the overall judgment. 4. During training of the entire model, this paper uses Bayesian optimization to obtain hyperparameters on the training data for the algorithm sub-modules with more complicated parameters. Compared with traditional random search and grid search, Bayesian optimization uses information from previously evaluated parameters during iteration and continually updates the prior, so it needs relatively few iterations and still performs well on non-convex problems. Experimental verification shows that Bayesian optimization not only keeps the computation time within a reasonable range, but also ensures the quality of the resulting hyperparameters. The electricity consumption anomaly detection scheme designed in this paper, based on a machine learning hybrid model, can judge and classify users scientifically and accurately from their statistical electricity data, and provides effective technical support for reducing the operation and maintenance costs of electricity supply companies and eliminating hidden dangers in the electricity system. |
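
The two decision steps summarized above can be sketched in plain Python. This is only an illustration under stated assumptions, not the paper's implementation: the abstract does not say how the three sub-model outputs are combined or which rule corrects the One-Class SVM results, so the weighted averaging in `soft_vote`, the rule in `corrected_label`, and every name, weight, and threshold (`threshold`, `lof_limit`, `tolerance`) are hypothetical. The inputs are assumed to be scores already produced by trained models.

```python
from dataclasses import dataclass


def soft_vote(probabilities, weights=None, threshold=0.5):
    """Combine sub-model anomaly probabilities by weighted averaging.

    `probabilities` holds one value per sub-model (for example XGBoost,
    Random Forest, Logistic Regression). Weighted averaging, the weights,
    and the threshold are assumptions made for this sketch.
    """
    if weights is None:
        weights = [1.0] * len(probabilities)
    if len(weights) != len(probabilities):
        raise ValueError("weights and probabilities must have equal length")
    score = sum(w * p for w, p in zip(weights, probabilities)) / sum(weights)
    return score, score >= threshold


@dataclass
class ShortSpanUser:
    flagged_by_ocsvm: bool         # True if One-Class SVM marked the user abnormal
    lof_score: float               # Local Outlier Factor; about 1 means inlier-like density
    mean_consumption: float        # user's average consumption over the recent month
    predicted_normal_mean: float   # LSTM-predicted average consumption of normal users


def corrected_label(user, lof_limit=1.5, tolerance=0.3):
    """Return True if the user is still treated as abnormal after correction.

    Hypothetical rule: a flagged user is moved back to "normal" only when
    the LOF score is low AND the user's average consumption is close to
    the predicted normal average.
    """
    if not user.flagged_by_ocsvm:
        return False
    if user.predicted_normal_mean <= 0:
        raise ValueError("predicted_normal_mean must be positive")
    relative_gap = (
        abs(user.mean_consumption - user.predicted_normal_mean)
        / user.predicted_normal_mean
    )
    looks_normal = user.lof_score <= lof_limit and relative_gap <= tolerance
    return not looks_normal
```

Because the correction only ever moves flagged users back to the normal class, it cannot add new false alarms. This matches the stated aim of keeping the recall-oriented One-Class SVM as the first stage while reducing misjudged normal users.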