As the intelligent terminal acquisition equipment of China's smart grid, the smart meter is not only an intelligent network sensor and controller but also a two-way gateway for energy and information, and it plays an important supporting role in the stable operation of the power grid. With the rapid development of the smart grid, the functions of smart meters have become increasingly rich, but the number of fault classes has grown with them. Accurate and timely prediction of smart meter fault classes can improve operation and maintenance efficiency and effectively reduce the cost of keeping the power grid stable. The fault classes of smart meters are diverse, and the occurrence of faults is affected by many factors. Machine learning is an effective way to establish an accurate mapping between these factors and fault types. However, the number of samples across fault classes is extremely imbalanced, and the fault data exhibit a multi-modal distribution in high-dimensional space, which poses great challenges to accurate fault classification. Focusing on the multi-class classification of smart meter faults, and targeting both the decision bias caused by the imbalance of fault samples and the difficulty of delineating decision boundaries caused by the overlap of fault data in the feature space, this paper studies data rebalancing techniques based on the one-to-many classification framework. The main work is as follows.

First, a rebalancing and classification method for smart meter fault samples under multi-modal data distribution is studied. To address the inconsistency between the data distributions before and after balancing in existing data balancing methods, a globally reliable fault sample generation method for imbalanced learning through latent vector reconstruction and
feature repulsion is proposed. We propose a latent vector reconstruction technique with mutual information constraints, in which the latent vector of an input sample is divided into a latent vector of key features and a latent vector of subordinate features. By keeping the key-feature latent vector and randomly replacing the subordinate-feature latent vector, we obtain a mutated latent vector; then, through decoder restoration, mutual information constraints, and adversarial discrimination, reliable, similar mutated samples are generated from the important features of the input sample. In this way, "sample-level" data generation based on the input sample is realized. On this basis, a feature repulsion technique and a combination coding technique are proposed. The former performs supervised feature representation learning by maximizing, in each dimension, the margin between the key-feature latent vector values of samples from different classes; the latter superimposes the per-dimension reconstruction error of the sample as a supplement to the key-feature latent vector. Together they alleviate, to a certain extent, the difficulty of feature extraction and classification in regions where samples overlap.

Second, a rebalancing and classification method for smart meter fault samples under multi-class data overlap in the feature space is studied. To address the difficulty of learning decision boundaries caused by overlapping fault data, a sample generation method based on sample migration and boundary enhancement in overlapping regions is proposed. Unlike previous fault sample rebalancing methods, which learn the distribution directly from the minority samples and generate new samples from it, this paper presents an overall sample balancing scheme that maps majority samples to boundary minority samples, which
is more conducive to the classifier's learning of class decision boundaries. Specifically, a cross-domain consistency constraint for the latent vector of the variational autoencoder (VAE) is designed, and a cross-class sample generation network consisting of two pairs of VAEs sharing a latent vector space is constructed. In addition, a generative adversarial mechanism encourages the network to better learn the commonalities and differences between samples of different classes. On this basis, a fault sample generation technique that enhances the boundary of the data overlapping region is proposed. By introducing a Euclidean distance minimization constraint, new minority samples are generated, and these samples together with the original majority samples form a clearer classification boundary in the feature space.

Finally, a rebalancing and classification method for smart meter fault samples based on contrastive learning is studied. Existing sample generation methods cannot, in essence, generate new information; they struggle to thoroughly mine the between-class differences in the feature space, and they inevitably introduce random noise. To solve these problems, an ensemble contrastive classification framework with sample-neighbors pair construction is proposed. It redefines traditional imbalanced classification as a label matching task between the sample to be classified and a group of similar samples. For any sample in the original training set, random sampling is performed multiple times in its k-nearest-neighbor pool to obtain multiple same-class and different-class comparison sample groups. Each group is combined with the original sample into a sample-neighbors pair, which serves as a positive or negative sample in the new classification problem. In this way, class balance and a multiplied expansion of the sample set are achieved without introducing any noise. On this basis, a robust classifier for predicting the class of new samples can be trained. For a given test sample, we can obtain abundant sample-neighbors
pairs by arbitrarily combining its corresponding comparison sample groups of different classes. By integrating the classifier's results over these pairs, the category of the test sample can be predicted through reverse reasoning. In this way, the classifier can fully mine the similarities and differences between the test sample and its similar neighbors in different classes, which improves the interpretability and rationality of the discrimination and yields more accurate and robust classification results.
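
The sample-neighbors pair construction described above can be illustrated with a minimal sketch. It is not the implementation used in this work. It assumes that the neighbor pool is built per class with Euclidean distance, that small pools are sampled with replacement, and that a simple distance-based `match_score` stands in for the trained pair classifier. All function names, parameters, and the toy data are hypothetical.

```python
import math
import random


def nearest_in_class(x, X, y, label, k, skip=None):
    """Indices of the k samples of class `label` closest to x (Euclidean)."""
    idx = [i for i in range(len(X)) if y[i] == label and i != skip]
    idx.sort(key=lambda i: math.dist(x, X[i]))
    return idx[:k]


def sample_group(pool, group_size, rng):
    """Draw one comparison group; fall back to replacement for small pools."""
    if len(pool) >= group_size:
        return rng.sample(pool, group_size)
    return [rng.choice(pool) for _ in range(group_size)]


def build_pairs(X, y, k=3, group_size=2, n_draws=2, seed=0):
    """Return (anchor index, neighbor-group indices, match label) triples.

    Each anchor contributes the same number of positive (same-class group)
    and negative (different-class group) pairs, so the new binary
    matching problem is balanced regardless of the original class sizes.
    """
    rng = random.Random(seed)
    classes = sorted(set(y))
    pairs = []
    for i, x in enumerate(X):
        same_pool = nearest_in_class(x, X, y, y[i], k, skip=i)
        other_pool = [j for c in classes if c != y[i]
                      for j in nearest_in_class(x, X, y, c, k)]
        if not same_pool or not other_pool:
            continue  # no pair can be formed for this anchor
        for _ in range(n_draws):
            pairs.append((i, sample_group(same_pool, group_size, rng), 1))
            pairs.append((i, sample_group(other_pool, group_size, rng), 0))
    return pairs


def match_score(x, group):
    """Placeholder for the trained pair classifier: closer groups score higher."""
    return -sum(math.dist(x, g) for g in group) / len(group)


def predict(x, X, y, score=match_score, k=3, group_size=2, n_draws=5, seed=0):
    """Average the pair scores per candidate class and return the best class."""
    rng = random.Random(seed)
    votes = {}
    for c in sorted(set(y)):
        pool = nearest_in_class(x, X, y, c, k)
        scores = [score(x, [X[j] for j in sample_group(pool, group_size, rng)])
                  for _ in range(n_draws)]
        votes[c] = sum(scores) / len(scores)
    return max(votes, key=votes.get)


# Toy imbalanced data: five samples of class 0, two of class 1.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5),
     (5.0, 5.0), (5.0, 6.0)]
y = [0, 0, 0, 0, 0, 1, 1]

pairs = build_pairs(X, y)
n_pos = sum(label for _, _, label in pairs)
pred_minority = predict((4.5, 5.0), X, y)
pred_majority = predict((0.2, 0.3), X, y)
```

In this toy run the original 5:2 class ratio becomes an equal number of positive and negative pairs, and every pair reuses only existing samples, which mirrors the claim that balance is reached without synthesizing noisy data. In practice the `match_score` placeholder would be replaced by a classifier trained on the output of `build_pairs`.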