
Research Of Fuzzy Clustering Algorithm For Incomplete Data Based On The Improved ACO With Interval Supervision

Posted on: 2017-01-13
Degree: Master
Type: Thesis
Country: China
Candidate: L Wang
Full Text: PDF
GTID: 2308330482499737
Subject: Computer application technology
Abstract/Summary:
In practical applications of fuzzy clustering, many data sets are incomplete because of measurement errors, missing observations, misinterpreted data, and similar causes. The reconstruction of missing attribute values is a key factor in clustering performance: the closer the estimates of the missing attributes are to the original real values, the more accurate the clustering result, so accurate estimation is the goal throughout the clustering iterations.

The FCM (fuzzy c-means) algorithm cannot be applied directly to the clustering analysis of incomplete data. To address this, the algorithm proposed in this paper encodes the cluster prototypes and the missing attributes together as the position vector of an individual ant. Each encoded ant in the colony therefore represents one candidate solution for the missing attributes and the cluster centers. The global optimization ability of ant colony optimization (ACO), guided by the pheromone trail, is used to optimize the values of the missing attributes and the cluster prototypes. The membership degree matrix is then calculated from the recovered data set and the cluster centers, which yields the clustering result for the incomplete data set.

During the global search of ant colony optimization, components of an ant's position vector, that is, the estimated values of the missing attributes and the cluster prototypes, may leave their supervised intervals, which can lead to serious deviation from the real values. To prevent this, an interval supervision strategy is proposed in this paper. If the estimate of a missing attribute leaves its supervised interval during an iteration, it is reset to the mathematical expectation of its nearest neighbors. When the estimate of a missing attribute has violated the interval constraint a certain number of times, that attribute is fixed at the corresponding mathematical expectation for all further iterations. For cluster prototypes, if the estimate of a prototype leaves its interval constraint during an iteration, it is reset to the center of the supervised interval.

Finally, two synthetic Gaussian data sets and the Iris, Breast, and Bupa data sets from the UCI Repository are used in simulation experiments. The experimental results show that encoding the cluster prototypes and the missing attributes as the position vector of an individual ant yields more accurate clustering results, and that the interval supervision strategy keeps the estimates of the missing attributes and the cluster prototypes as close as possible to the real values, which leads to more satisfactory clustering results.
Keywords/Search Tags: fuzzy clustering, incomplete data sets, ant colony, hybrid optimization, interval supervision
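
The encoding and the interval supervision strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The function and class names, the layout of the position vector (prototypes first, then missing-value estimates in row-major order), the use of `None` to mark a missing attribute, and the `max_violations` threshold are all assumptions made for illustration. The abstract does not state how the supervised interval or the nearest-neighbor expectation is computed, so both are taken here as given inputs. The membership update is the standard FCM formula with fuzzifier `m`; the pheromone-guided search itself is not shown.

```python
import math


def decode_position(position, n_clusters, n_features):
    """Split an ant's flat position vector into prototypes and missing-value estimates.

    Assumed layout: n_clusters * n_features prototype components first,
    followed by one estimate per missing attribute.
    """
    split = n_clusters * n_features
    prototypes = [position[i * n_features:(i + 1) * n_features]
                  for i in range(n_clusters)]
    return prototypes, position[split:]


def fill_missing(data, missing_estimates):
    """Return a copy of data with None entries replaced, in row-major order."""
    estimates = iter(missing_estimates)
    return [[next(estimates) if value is None else value for value in row]
            for row in data]


def fcm_memberships(data, prototypes, m=2.0):
    """Standard FCM update: u_ik = 1 / sum_j (d_ik / d_jk) ** (2 / (m - 1))."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in data:
        dists = [math.dist(x, v) for v in prototypes]
        zeros = [d == 0.0 for d in dists]
        if any(zeros):
            # The point coincides with one or more prototypes:
            # split full membership among those prototypes.
            share = 1.0 / sum(zeros)
            memberships.append([share if z else 0.0 for z in zeros])
            continue
        memberships.append(
            [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
             for d_i in dists])
    return memberships


class MissingValueSupervisor:
    """Interval supervision for one missing attribute (hypothetical helper).

    low, high   : supervised interval, assumed to be supplied by the caller
    expectation : nearest-neighbor expectation, also assumed to be supplied
    """

    def __init__(self, low, high, expectation, max_violations=3):
        self.low = low
        self.high = high
        self.expectation = expectation
        self.max_violations = max_violations  # assumed threshold
        self.violations = 0
        self.fixed = False

    def supervise(self, estimate):
        if self.fixed:
            return self.expectation
        if self.low <= estimate <= self.high:
            return estimate
        # Out of interval: count the violation and reset to the expectation.
        self.violations += 1
        if self.violations >= self.max_violations:
            self.fixed = True  # pinned for all further iterations
        return self.expectation


def supervise_prototype(value, low, high):
    """Reset an out-of-interval prototype component to the interval center."""
    if low <= value <= high:
        return value
    return (low + high) / 2.0
```

In a full algorithm, each ant's position would be decoded, passed through the two supervision steps, and then scored (for example by the FCM objective on the filled data) before the pheromone update. Keeping supervision separate from the search step means the same checks could be reused with a different optimizer.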