| With the rapid development of intelligent and information technologies in the smart grid, power companies collect enormous volumes of electricity consumption data every day, which poses new challenges for electricity theft detection. Electricity theft directly affects the revenue and development of power companies, so improving detection accuracy has become an important research topic. The main contributions of this paper are as follows. Firstly, the overall pattern of missing values in the electricity theft data is analyzed; for users whose proportion of missing data is below 30%, the missing forest (MissForest) algorithm is used to impute the missing values. Then, to address the severe class imbalance between normal users and electricity-stealing users, this paper proposes BC-SMOTE, an oversampling technique that uses K-means clustering to optimize the synthesis of minority samples near the class boundary. The newly generated samples are concentrated on the sample boundary, which makes classification easier, and the synthesis of new theft samples is kept away from normal samples, which avoids introducing new noise. The method is validated experimentally on a publicly available dataset from the State Grid Corporation of China, demonstrating its effectiveness and superiority. Secondly, the oversampled data are cleaned with the Edited Nearest Neighbors (ENN) and Local Outlier Factor (LOF) algorithms: ENN removes noisy normal samples that lie among the power theft samples, and LOF removes noisy power theft samples that lie among the normal samples. Features are then extracted from the cleaned data with Tsfresh, the correlation coefficient is computed with the Maximal Information Coefficient (MIC), and the strongly correlated features are selected. Then, considering the frequent neglect of temporal features in power theft 
detection models, a general time-series network (TimesNet) that is sensitive to temporal features is used to exploit the extracted temporal and other features more effectively through transformation and adaptive fusion, achieving more accurate detection of power theft by users. Finally, in simulation and comparison experiments, BC-SMOTE is compared with other oversampling algorithms under multiple classifiers, including decision tree, K-nearest neighbor, and random forest, which demonstrates the effectiveness and stability of BC-SMOTE. A comparison of the evaluation metrics before and after ENN and LOF cleaning shows that the cleaned data yield better results. TimesNet, with and without feature extraction and filtering, is also compared against multiple detection models; the results show that TimesNet has stronger classification ability than these models and that feature extraction and filtering further improve its detection performance, confirming the accuracy and effectiveness of the proposed method. |
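
The oversample-then-clean idea summarized above can be illustrated with a minimal sketch. It is not the paper's BC-SMOTE: it uses plain SMOTE-style interpolation between minority samples (without the K-means clustering and boundary selection steps, whose details the abstract does not give) followed by a basic ENN filter. The function names `smote_like` and `enn_clean`, the toy 2-D data, and the choice of `k=3` are all assumptions made for illustration.

```python
import math
import random


def k_nearest(point, candidates, k):
    """Return up to k candidates closest to `point` (Euclidean distance)."""
    return sorted(candidates, key=lambda c: math.dist(point, c))[:k]


def smote_like(minority, n_new, k=3, seed=0):
    """Plain SMOTE-style synthesis (assumption: NOT the paper's BC-SMOTE).

    Each new point lies on the segment between a random minority sample
    and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [m for m in minority if m is not base]
        neighbour = rng.choice(k_nearest(base, others, k))
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def enn_clean(samples, labels, k=3):
    """Edited Nearest Neighbours: keep a sample only if the majority of
    its k nearest neighbours share its label."""
    kept = []
    for i, (x, y) in enumerate(zip(samples, labels)):
        rest = [
            (math.dist(x, s), lab)
            for j, (s, lab) in enumerate(zip(samples, labels))
            if j != i
        ]
        rest.sort(key=lambda t: t[0])
        votes = [lab for _, lab in rest[:k]]
        if votes.count(y) * 2 > len(votes):
            kept.append((x, y))
    return kept


# Toy data (hypothetical): label 0 = normal user, label 1 = power theft.
normal = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.1, 0.1), (0.2, 0.3)]
theft = [(2.0, 2.0), (2.2, 2.1), (2.1, 2.3)]

new_theft = smote_like(theft, n_new=3)

# Append one deliberately mislabeled point: a "normal" label inside the theft region.
samples = normal + theft + new_theft + [(2.1, 2.1)]
labels = [0] * len(normal) + [1] * (len(theft) + len(new_theft)) + [0]

cleaned = enn_clean(samples, labels)
```

In this toy run the mislabeled point is surrounded by theft samples, so the ENN filter discards it while keeping the rest. A real pipeline would more likely rely on library implementations of ENN and LOF than on hand-written loops like these.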