| With the rapid development of the power industry, a large amount of data has been accumulated. These data mainly come from power generation, transmission, distribution, dispatching, and electricity consumption. Outlier detection is an important data mining technique: by analyzing the abnormal data in a dataset, we can discover the value hidden in the data. Abnormal data detection plays an important role in power systems. For smart distribution networks, effective outlier detection methods enable timely diagnosis of the abnormal states that affect power quality, help identify the causes of power quality disturbances, and help prevent failures, thereby reducing grid losses. For equipment monitoring, anomaly detection helps check the operating status of equipment and ensure its stable operation. For intelligent electricity systems, outlier detection can improve the service level of the power grid, save considerable human resources, reduce operating costs, and make the grid more economical. Traditional methods cannot meet the requirements of mining massive data, whereas machine-learning-based methods for big data have developed rapidly in recent years. This paper therefore introduces the development of power industry data and the background and significance of anomaly detection. Following the data analysis process, it then describes some of the steps and methods for data cleaning, data transformation, and dimensionality reduction. For different analysis methods, we study how to adapt them so that they can be applied to anomaly detection, including probability-based methods, machine-learning-based methods, and frequent pattern mining algorithms. Among the machine-learning-based methods, we focus on linear-model-based methods, proximity-based outlier detection, and outlier ensembles. For local outlier detection, combining clustering with the LOF algorithm can find local anomalies accurately, but LOF has high computational complexity and cannot detect anomalies quickly, which becomes a bottleneck for applications with strict real-time requirements. Isolation forests have linear complexity and still detect anomalies accurately, so in this paper we combine a clustering algorithm with isolation forests for outlier detection. LOF is usually combined with K-Means, but K-Means does not work efficiently, so we use the Affinity Propagation algorithm instead, which gives good clustering results and performs better than K-Means. One class of power data, to which alarm data belongs, contains a large number of categorical features (labels). It is difficult to apply the above methods to outlier detection on such data, so we improve the FP-tree so that it can be used to mine frequent alarms from alarm data. We analyze outliers in real data and present the results. Finally, we discuss future directions for development as ideas for further exploration. |
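
As a rough illustration of the clustering-plus-isolation-forest idea summarized above, the following sketch scores each point inside its own cluster with a small isolation forest written from scratch. It is not the authors' implementation. The cluster labels are assumed to be given (in the paper they would come from Affinity Propagation), and every function name, parameter (`n_trees`, `sample_size`, `seed`), and the toy data are hypothetical choices made for this example.

```python
import math
import random
from collections import defaultdict


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split points on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return ("leaf", len(points))
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # simplification: stop instead of trying another feature
        return ("leaf", len(points))
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return ("node", dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(tree, x, depth=0):
    """Depth at which x lands in a leaf, adjusted for the points left in that leaf."""
    if tree[0] == "leaf":
        return depth + c_factor(tree[1])
    _, dim, split, left, right = tree
    return path_length(left if x[dim] < split else right, x, depth + 1)


def isolation_scores(points, n_trees=100, sample_size=64, seed=0):
    """Anomaly score in (0, 1] per point; values near 1 mean 'easily isolated'."""
    rng = random.Random(seed)
    size = min(sample_size, len(points))  # requires at least 2 points
    max_depth = math.ceil(math.log2(size))
    trees = [build_tree(rng.sample(points, size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(size)
    return [2.0 ** (-(sum(path_length(t, x) for t in trees) / n_trees) / norm)
            for x in points]


def score_per_cluster(points, labels, **kwargs):
    """Run a separate isolation forest inside each cluster (labels assumed given)."""
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)
    scores = [0.0] * len(points)
    for members in groups.values():
        if len(members) < 2:
            # assumption of this sketch: a singleton cluster counts as fully isolated
            for i in members:
                scores[i] = 1.0
            continue
        local = isolation_scores([points[i] for i in members], **kwargs)
        for i, s in zip(members, local):
            scores[i] = s
    return scores


# Toy data: a dense cluster, a wide cluster, and one point (index 30) that is
# far from the dense cluster relative to that cluster's own spread.
data_rng = random.Random(42)
dense = [(data_rng.gauss(0.0, 0.1), data_rng.gauss(0.0, 0.1)) for _ in range(30)]
wide = [(data_rng.gauss(10.0, 1.0), data_rng.gauss(10.0, 1.0)) for _ in range(30)]
points = dense + [(1.0, 1.0)] + wide
labels = [0] * 31 + [1] * 30

scores = score_per_cluster(points, labels)
most_anomalous_in_cluster_0 = max(range(31), key=lambda i: scores[i])
print(most_anomalous_in_cluster_0, round(scores[most_anomalous_in_cluster_0], 3))
```

Scoring inside each cluster is what makes the detection local: the point at index 30 is judged against the tight spread of its own cluster rather than against the whole dataset, where the wide cluster would make it look unremarkable.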