
Research On Power Load Data Mining Based On Dimensionality Reduction And Clustering Techniques

Posted on: 2021-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: X S Huang
GTID: 2392330611982781
Subject: Control engineering
Abstract/Summary:
With the development of power networks towards intelligence, the volume of power load data is growing explosively and its feature space is becoming high-dimensional. Clustering analysis, an important data mining technique, can mine the inherent pattern information in power load data and, when combined with dimensionality reduction, extract users’ behavior habits at low computational cost. It has important application value in energy efficiency management and abnormal user detection on the supply and demand sides. This thesis studies information mining from power load data using dimensionality reduction and clustering analysis, and the following research has been done.

(1) A power load experimental data set is obtained from the U.S. Open Energy Information website (OpenEI). Four typical clustering algorithms, K-means, DBSCAN, BIRCH and GMM, are compared on this set under different data set sizes and different numbers of clusters. Clustering accuracy and clustering speed are analyzed, with the Davies-Bouldin index (DBI) used as the clustering validity index. The comparison concludes that the overall clustering ability of the partition-based K-means algorithm is better than that of the other three algorithms.

(2) The K-means algorithm is selected for further study and improvement. (a) Because the number of clusters must be determined in advance, the GSA elbow criterion is introduced to find the globally optimal number of clusters. (b) Because randomly selected initial clustering centres easily cause problems such as algorithm instability and convergence to a local optimum, the Pillar K-means (PK-means) algorithm, which includes rules for the optimal selection of initial clustering centres, is proposed for use. Experiments confirm the validity of the GSA elbow criterion and show that PK-means achieves better clustering accuracy and iteration speed than K-means. Outlier detection with PK-means on part of the power load profiles in the experimental set shows that its high-quality clustering helps detect power load outliers effectively.

(3) Five common dimensionality reduction algorithms, namely Principal Component Analysis (PCA), Kernel Principal Component Analysis (KPCA), Locally Linear Embedding (LLE), Multidimensional Scaling (MDS) and Isometric Feature Mapping (ISOMAP), are compared in terms of their dimensionality reduction and clustering results under different target dimensions. The comparison shows that linear dimensionality reduction is not suitable for the experimental set of this thesis. KPCA, the fastest of the dimensionality reduction algorithms, and ISOMAP, which gives higher dimensionality reduction and clustering accuracy, are selected for combination with PK-means. The experimental and verification results show that, compared with PK-means alone, KPCA+PK-means improves clustering speed and ISOMAP+PK-means improves clustering accuracy, indicating that combining clustering with dimensionality reduction can improve the ability of the clustering algorithm to a certain extent.
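
The following is a minimal sketch, not the thesis implementation, of how a clustering result on load profiles can be scored with the Davies-Bouldin index (DBI). It uses plain K-means with randomly sampled initial centres, the baseline whose sensitivity to initialisation the abstract describes. The data are hypothetical 24-hour profiles built from three made-up shapes, not the OpenEI set. Picking the number of clusters with the lowest DBI is an illustrative stand-in and is not the GSA elbow criterion, and all function names are assumptions introduced here.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(points, k, seed=0, max_iter=100):
    """Plain Lloyd's K-means with randomly sampled initial centres."""
    rng = random.Random(seed)
    centres = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centre.
        labels = [min(range(k), key=lambda j: dist(p, centres[j])) for p in points]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centre if a cluster ends up empty.
            new_centres.append(mean(members) if members else centres[j])
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres


def davies_bouldin(points, labels, centres):
    """Davies-Bouldin index: lower means tighter, better separated clusters."""
    k = len(centres)
    scatter = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        # Mean distance of members to their centre (0.0 for an empty cluster).
        scatter.append(
            sum(dist(p, centres[j]) for p in members) / len(members) if members else 0.0
        )
    worst = [
        max((scatter[i] + scatter[j]) / dist(centres[i], centres[j])
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return sum(worst) / k


def synthetic_profiles(n_per_shape=20, seed=42):
    """Hypothetical 24-hour load profiles: morning peak, evening peak, flat."""
    rng = random.Random(seed)
    shapes = [
        [1.0 + 2.0 * math.exp(-((h - 8) ** 2) / 8.0) for h in range(24)],
        [1.0 + 2.0 * math.exp(-((h - 19) ** 2) / 8.0) for h in range(24)],
        [2.0 for _ in range(24)],
    ]
    return [[v + rng.gauss(0, 0.1) for v in shape]
            for shape in shapes for _ in range(n_per_shape)]


profiles = synthetic_profiles()
scores = {}
for k in range(2, 6):
    labels, centres = kmeans(profiles, k)
    scores[k] = davies_bouldin(profiles, labels, centres)

# Illustrative choice only: the candidate k with the lowest DBI.
best_k = min(scores, key=scores.get)
print(scores, best_k)
```

Because the initial centres are random, different seeds can give different partitions and DBI values for the same k, which is the instability that motivates a deterministic initial-centre rule such as the one in PK-means.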
Keywords/Search Tags:Power load data, Clustering analysis, Dimensionality reduction technique, GSA elbow criterion, PK-means algorithm