| Electricity theft is a significant problem for utility companies worldwide and is estimated to cause losses of billions of dollars annually. Traditional detection methods are time-consuming and often ineffective, so a more efficient and accurate approach is needed. With advances in technology, machine learning methods have been applied to electricity theft detection. By learning from data collected by smart meters, machine learning classification algorithms can identify consumption patterns associated with theft, improving both the accuracy and the efficiency of detection. However, the datasets used for electricity theft detection are often imbalanced, which complicates the training and evaluation of detection models and leads to high false-positive rates, difficulties in performance evaluation, and poor generalization. In this paper, we address several issues in current electricity theft detection: imbalanced datasets, poor detection performance, long detection times, and suboptimal performance on long sequences. We propose the ADASYN-ENN hybrid sampling algorithm and the RACTEC model, which is based on convolutional kernel transformation, for detecting electricity theft by power consumers. (1) We design a nearest-neighbor-based hybrid sampling algorithm that combines the ADASYN oversampling algorithm with the ENN undersampling algorithm. ADASYN is first used to oversample the electricity consumption samples of theft users until the number of theft samples is 50% of the number of normal samples. An improved ENN algorithm with a strict exclusion strategy is then used to undersample the consumption samples of normal users, removing outliers and noise. The same nearest-neighbor parameters are used throughout, and the result is a balanced dataset with approximately equal numbers of normal and theft samples. (2) We
propose a power consumer electricity theft detection model based on convolutional kernel transformation. The model consists of a rolling average module, a convolutional kernel transformation module, and a classification module. The rolling average module smooths out fluctuations within data samples and injects diversity into the convolutional kernel transformation module. The convolutional kernel transformation module extracts temporal features from the electricity consumption data and passes them to the classification module for learning, thereby enhancing detection performance. The classification module uses the Bagging approach to learn from the aggregated features produced by the convolutional kernel transformation, integrating the output detection results and improving the model's generalization. (3) We run the hybrid sampling algorithm on real-world datasets to generate balanced datasets, and conduct comparative experiments on these balanced datasets against multiple benchmark theft detection models. The experimental results show that the proposed model achieves better overall performance, shorter runtime, and higher efficiency with the same computational resources. We also design experiments to verify the effectiveness of the model on electricity consumption data of different lengths. The results indicate that the model performs better on long sequences and can effectively accomplish electricity theft detection tasks. |
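
The hybrid sampling step in contribution (1) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The function names (`knn`, `adasyn_oversample`, `enn_strict`, `adasyn_enn`), the default `k=3`, and the label convention (0 = normal, 1 = theft) are assumptions made for illustration. The oversampler draws seed points at random in proportion to their ADASYN hardness weight rather than using ADASYN's per-point allocation, and the "strict exclusion strategy" is read here as "drop a normal sample unless all of its k nearest neighbors are normal", which may differ from the paper's improved ENN.

```python
import math
import random

def knn(point, data, k, skip=None):
    """Indices of the k rows of `data` closest to `point` (Euclidean), ignoring index `skip`."""
    order = sorted((i for i in range(len(data)) if i != skip),
                   key=lambda i: math.dist(point, data[i]))
    return order[:k]

def adasyn_oversample(X, y, ratio=0.5, k=3, seed=0):
    """ADASYN-style oversampling: grow class 1 until it has about ratio * |class 0| samples."""
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == 1]
    n_major = len(y) - len(minority)
    n_new = max(0, round(ratio * n_major) - len(minority))
    if n_new == 0 or len(minority) < 2:
        return list(X), list(y)
    # Hardness of a theft sample = share of normal samples among its k neighbors.
    hardness = [sum(y[j] == 0 for j in knn(X[i], X, k, skip=i)) / k for i in minority]
    total = sum(hardness)
    weights = [h / total for h in hardness] if total > 0 else [1.0] * len(minority)
    min_X = [X[i] for i in minority]
    new_X = []
    for _ in range(n_new):
        # Harder samples are picked more often as interpolation seeds.
        m = rng.choices(range(len(min_X)), weights=weights)[0]
        nb = rng.choice(knn(min_X[m], min_X, k, skip=m))
        gap = rng.random()
        new_X.append([a + gap * (b - a) for a, b in zip(min_X[m], min_X[nb])])
    return list(X) + new_X, list(y) + [1] * n_new

def enn_strict(X, y, k=3):
    """ENN on class 0 only: keep a normal sample only if all k neighbors are normal."""
    keep = [i for i in range(len(y))
            if y[i] == 1 or all(y[j] == 0 for j in knn(X[i], X, k, skip=i))]
    return [X[i] for i in keep], [y[i] for i in keep]

def adasyn_enn(X, y, ratio=0.5, k=3, seed=0):
    """Oversample theft samples, then edit normal samples, using the same k in both stages."""
    X_os, y_os = adasyn_oversample(X, y, ratio, k, seed)
    return enn_strict(X_os, y_os, k)
```

Calling `adasyn_enn(X, y)` on a list of feature vectors and a list of 0/1 labels returns the resampled pair. Sharing `k` between the two stages mirrors the abstract's remark about consistent nearest-neighbor parameters. How close the final class counts end up depends on how many normal samples the editing step removes.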
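
The feature-extraction path in contribution (2) can be sketched in the same spirit. The abstract does not specify the kernel design, so this example assumes ROCKET-style random dilated kernels with two pooled features per kernel (maximum response and proportion of positive values). It also assumes that the rolling average is applied at several window sizes to diversify the kernel inputs. All names and defaults (`rolling_mean`, `random_kernel`, `kernel_features`, `transform`, `windows=(1, 3, 7)`, `n_kernels=20`) are hypothetical. The Bagging classifier that would consume these features is omitted.

```python
import random

def rolling_mean(series, window):
    """Moving average over `window` consecutive readings (no padding)."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

def random_kernel(rng, length=9, max_dilation=4):
    """Random zero-mean weights, a random bias, and a random dilation."""
    weights = [rng.gauss(0, 1) for _ in range(length)]
    mean = sum(weights) / length
    weights = [w - mean for w in weights]
    return weights, rng.uniform(-1, 1), rng.randint(1, max_dilation)

def kernel_features(series, kernel):
    """Slide one dilated kernel over the series; return (max response, share of positive responses)."""
    weights, bias, dilation = kernel
    span = (len(weights) - 1) * dilation
    if len(series) <= span:
        raise ValueError("series is too short for this kernel and dilation")
    outputs = [bias + sum(w * series[t + j * dilation] for j, w in enumerate(weights))
               for t in range(len(series) - span)]
    ppv = sum(o > 0 for o in outputs) / len(outputs)
    return max(outputs), ppv

def transform(series, windows=(1, 3, 7), n_kernels=20, seed=0):
    """Feature vector for one consumption series: every kernel applied to every smoothed copy."""
    rng = random.Random(seed)
    kernels = [random_kernel(rng) for _ in range(n_kernels)]
    features = []
    for window in windows:
        smoothed = rolling_mean(series, window)
        for kernel in kernels:
            features.extend(kernel_features(smoothed, kernel))
    return features
```

With the defaults, `transform` yields `len(windows) * n_kernels * 2` features per series. The kernels are fixed by `seed`, so every series is mapped through the same transformation before classification.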