| Clustering is one of the most important techniques for exploratory data analysis,with applications ranging from statistics,computer science,bioinformatics.Many researchers have proposed an enormous variety of clustering methods.Among these methods,spectral clustering,a series of methods based on the eigendecomposition on Laplacian matrix,always achieves superior empirical performance.Compared to many clustering methods highly tied to Euclidean geometry,spectral clustering has no limitations on the shape of data and can detect linearly non-separable pattern.Despite these virtues,spectral clustering has less applicability to large scale data set because of its high time complexity.Generally,spectral clustering usually needs to construct an affnity matrix to model the similarities between all pairs of the data points.Then compute the first k eigenvectors of the corresponding Laplacian matrix,which is also called spectral embedding.Finally,run k-means on each row of first k eigenvectors.The first two steps is usually the bottleneck of the whole method.First,we focus on how to improve the accuracy of landmark representation and tackle the bottleneck of the eigendecomposition.The existing landmark-based spectral clustering methods use Nadaraya-Watson method to represent the original data with some selected landmarks.As a kernel regression method,Nadaraya-Watson methods is lack of meaningful interpretations and sensitive to brand-width parameter.In some cases,such representation can usually result in poor approximation.We propose a landmark-based spectral clustering with locality-constrained linear coding.The basic ideas of the proposed method are from two aspects.First,each data point can be well represented by some selected landmarks with locality-constrained linear coding.Second,spectral embedding can be largely accelerated by constructing a proper affinity matrix.Extensive experiments show the effectiveness and effciency of the proposed method comparing to other spectral clustering methods.Next,the existing landmark-based spectral clustering methods do not utilize the prior knowledge embedded in a given similarity function.To address the aforementioned challenges,a landmark-based spectral clustering method with local similarity representation is proposed.The proposed methodfirstly encodes the original data points with their most ?imilar?andmarks by using a given similarity function.Then the proposed method performs singular value decomposition on the encoded data points to get the spectral embedded data points.Finally run k-means on the embedded data points to get the clustering results.Extensive experiments show the effectiveness and effciency of the proposed method.Finally,we focus on how to select landmarks and tackle the bottleneck of which k-means is sensitive to the shape of data.To address the aforementioned challenges,a landmark-based spectral clustering method with heuristic landmark selection is proposed.The proposed method firstly generate affinity matrix of all the data points with a given similarity function.Secondly,generate the minimum spanning tree of the affinity matrix and the data points which have the most degrees are selected as the landmarks.Thirdly,encodes the original data points with their most ?imilar?andmarks by using a given similarity function.Next the proposed method performs singular value decomposition on the encoded data points to get the spectral embedded data points.Finally run k-means on the embedded data points to get the clustering results.Extensive experiments show the effectiveness and effciency of the proposed method. |