| Many industries, particularly public utilities, have accumulated large amounts of data, and analyzing that data has become an urgent problem. The development of clustering technology has greatly eased this situation. Clustering aims to find hidden structure and identify groups with similar behavior in a given dataset. In the field of pattern recognition, spectral clustering has become a focus of academic research. It is a clustering method developed in recent years that builds on spectral graph theory and obtains a clustering by computing an optimal graph partition. Compared with traditional clustering algorithms, spectral clustering algorithms can cluster sample spaces that are not convex or spherical and do not suffer from the problem of local optima. In addition, spectral clustering algorithms are less sensitive to irregular data and achieve superior performance. However, the clustering quality of spectral methods depends heavily on the similarity measure, so designing a good similarity measure is very important for the performance of a spectral clustering algorithm. First, this paper describes the background and methods of spectral clustering. Second, based on an analysis of existing similarity measures and combined with prior knowledge of clustering consistency, it proposes a similarity measure based on shared nearest neighbors. Third, introducing this measure into spectral clustering yields a spectral clustering algorithm based on shared nearest neighbors. Finally, to test the practicality and effectiveness of the proposed algorithm, a number of experiments are carried out on two artificial data sets and four UCI data sets, with classic spectral clustering algorithms as the comparison baseline. Experimental results show that the proposed algorithm performs much better than the classic spectral clustering algorithms and is less sensitive to parameter selection.
To verify the ability of the proposed algorithm to solve practical problems, we select Chinese text clustering as the application setting and carry out a comparison test between the proposed algorithm and K-Means. Experimental results show that the proposed algorithm is more suitable for Chinese text clustering than K-Means. |
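
The idea of a shared-nearest-neighbor similarity can be illustrated with a minimal sketch. The abstract does not give the exact definition used in this work, so the code below assumes one common form: the similarity of two points is the number of points that appear in both of their k-nearest-neighbor lists under Euclidean distance. The function names `knn_sets` and `snn_similarity` and the parameter `k` are hypothetical and chosen only for illustration.

```python
import math


def knn_sets(points, k):
    """Return, for each point, the index set of its k nearest neighbors.

    Assumption: Euclidean distance; a point is never its own neighbor.
    """
    neighbors = []
    for i, p in enumerate(points):
        # Rank every other point by its distance to p and keep the k closest.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(set(others[:k]))
    return neighbors


def snn_similarity(points, k):
    """Build a symmetric matrix of shared-nearest-neighbor counts.

    Assumed definition (not taken from the source): entry (i, j) is the
    number of indices present in both k-nearest-neighbor sets.
    The diagonal is left at zero.
    """
    nn = knn_sets(points, k)
    n = len(points)
    sim = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = len(nn[i] & nn[j])
            sim[i][j] = sim[j][i] = shared
    return sim


if __name__ == "__main__":
    # Two well-separated groups of three points each.
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    for row in snn_similarity(pts, k=2):
        print(row)
```

A matrix built this way could serve as the affinity matrix of a standard spectral clustering pipeline in place of a Gaussian kernel. Points in different groups share no neighbors and so receive zero similarity, which is one plausible reason such a measure would depend less on a kernel width parameter.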