Study On Several Problems In Fuzzy Clustering

Posted on: 2010-11-23
Degree: Master
Type: Thesis
Country: China
Candidate: X Q Hu
Full Text: PDF
GTID: 2120330338976527
Subject: Applied Mathematics
Abstract/Summary:
As an important branch of unsupervised pattern recognition, cluster analysis has become an important tool in modern data analysis. Different starting points and criteria lead to different taxonomies of clustering algorithms, so clustering algorithms form a large and diverse family, and many have been proposed to date.

Fuzzy c-Means (FCM), owing to its simplicity and efficiency, is one of the most popular clustering algorithms. FCM and a number of its extensions have been applied with great success to a wide range of clustering tasks. What all of these methods have in common is that they rely on an iterative procedure to compute the optimal cluster means (i.e. centers). As a consequence, they are noticeably sensitive to the initial cluster centers and to any noise in the data. In addition, they tend to find only a pre-specified number of sphere-shaped clusters. In many situations, however, there may be no "true" representatives to serve as cluster centers. This work therefore proposes a novel cluster-center-free reformulation of FCM that can handle arbitrarily shaped clusters. This is done by defining a novel fuzzy similarity function between each point and a cluster. The reformulation further allows a meaningful cluster-validity index to be derived for determining an accurate number of clusters in the data.

Hierarchical clustering is another of the most popular clustering algorithms. It can display several partitioning results for a dataset; the problem, however, is how users should obtain the most satisfactory classification from these results. It is well known that each clustering result in hierarchical clustering corresponds to a fuzzy λ-level set, so selecting the optimal classification result is equivalent to selecting the best threshold λ*. In this work, a new cluster-validity index based on compactness and separation is established using the similarity matrix of the dataset, and it is applied to determine the best threshold for the hierarchical clustering method.

The experimental results on both synthetic and real-world datasets demonstrate the effectiveness of the new algorithm and the new cluster-validity indices.
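
For context, the conventional center-based FCM that the abstract takes as its starting point can be sketched as below. This is a minimal sketch of the textbook algorithm only, not the cluster-center-free reformulation proposed in the thesis, whose details the abstract does not give. The function name `fuzzy_c_means`, the fuzzifier default `m = 2`, the random initialisation, and the stopping tolerance are all assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook center-based FCM (illustrative; names and defaults are assumed).

    X: (n, d) data array, c: number of clusters, m > 1: fuzzifier.
    Returns (centers, U), where U[i, k] is the membership of point i in cluster k.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships; each row is normalised to sum to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Center update: mean of the data weighted by U**m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U
```

The loop makes the dependence on iteratively recomputed centers explicit: every membership is defined through a distance to a center, which is why the result depends on the initialisation and favours sphere-shaped clusters.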
Keywords/Search Tags: fuzzy clustering, fuzzy c-means, hierarchical clustering, cluster centers, threshold, cluster-validity index
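
The correspondence between hierarchical clustering results and fuzzy λ-level sets mentioned in the abstract can be illustrated with a small sketch. It assumes a reflexive, symmetric fuzzy similarity matrix, computes its max-min transitive closure, and reads off the crisp partition at a given threshold. The helper names `transitive_closure` and `lambda_cut_clusters` and the example matrix are hypothetical, and the thesis's own cluster-validity index for choosing λ* is not reproduced here because the abstract does not define it.

```python
import numpy as np

def transitive_closure(R):
    """Max-min transitive closure of a reflexive, symmetric fuzzy relation."""
    R = np.asarray(R, dtype=float)
    while True:
        # Max-min composition: (R o R)[i, j] = max_k min(R[i, k], R[k, j]).
        R2 = np.minimum(R[:, :, None], R[None, :, :]).max(axis=1)
        if np.allclose(R2, R):
            return R2
        R = R2

def lambda_cut_clusters(R_star, lam):
    """Cluster labels of the crisp partition given by the lambda-cut of R_star.

    R_star must be a fuzzy equivalence relation (e.g. a transitive closure),
    so that the points with R_star[i, j] >= lam form the class of point i.
    """
    n = R_star.shape[0]
    labels = -np.ones(n, dtype=int)
    next_label = 0
    for i in range(n):
        if labels[i] < 0:
            labels[R_star[i] >= lam] = next_label
            next_label += 1
    return labels

# Hypothetical similarity matrix for three points.
R = np.array([[1.0, 0.8, 0.2],
              [0.8, 1.0, 0.3],
              [0.2, 0.3, 1.0]])
R_star = transitive_closure(R)
for lam in (0.3, 0.5, 0.9):
    print(lam, lambda_cut_clusters(R_star, lam).tolist())
```

Lowering λ merges clusters and raising it splits them, so each threshold yields one level of the hierarchy; choosing λ* amounts to picking one of these partitions.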