Font Size: a A A

Differential Evolution Algorithm And Its Application In Clustering Analysis

Posted on:2011-07-22Degree:DoctorType:Dissertation
Country:ChinaCandidate:W Y GongFull Text:PDF
GTID:1100360308975264Subject:Geographic Information System
Abstract/Summary:PDF Full Text Request
Global optimization is a common problem in almost every field, such as engineer design, operational research, bio-medical science, data mining, and so on. How to design an efficient algorithm to solve this problem is essential in science and real applications. Recently, using Evolutionary Algorithms (EAs) to solve the global optimization problem is more popular. Since EAs are self-adaptive, self-organized, parallel, using EAs for global optimization is reasonable and available.Differential evolution (DE), which is an evolutionary algorithm, is a simple, fast, and efficient population-based direct search algorithm for global optimization. The DE algorithm has been widely used in many fields. Among its advantages are its simple structure, ease of use, robustness, and fast convergence. However, the original DE algorithm has some pitfalls:1) DE is good at exploring the search space and locating the region of global minimum, but it is slow at the exploitation of the solution; 2) The parameters of DE are problem dependent and the choice of them is often critical for the performance of DE; 3) There are many strategies in the DE literature, however, choosing the best among different mutation strategies available for DE is also not easy for a specific problem.Clustering analysis is an important technique for the data analysis in data mining and machine learning. Clustering is the unsupervised classification of objects (patterns) into different groups, or more precisely, the partitioning of a dataset into subsets (clusters), so that the data in the same clusters is similar and the data in different clusters is dissimilar according to some defined distance measure. Data clustering is a common technique for statistical data analysis, which is used in many fields, including machine learning, data mining, pattern recognition, image analysis, and bioinformatics. Most of the clustering algorithm can be converted into a global optimization problem. From the optimization point of view, clustering is an NP-hard problem. Due to the complexity of clustering, most of the traditional clustering algorithm is a local optimal algorithm. In order to solve the clustering problem more effectively, one possible way is the hybridization of EAs with the traditional clustering algorithm, so that the evolutionary clustering algorithm is able to enhance the robustness and efficiency of the traditional clustering algorithm. However, there are two main drawbacks in the evolutionary clustering algorithm:1) In EAs, there are many control parameters, such as crossover probability and mutation probability. These parameters are problem-dependent and sensitive. Hence, this drawback may degrade the commonality of the evolutionary clustering algorithm.2) The execution time of EAs is significantly higher than that of other traditional clustering algorithms (e.g. K-means and FCM), especially when applied to large datasets.In this dissertation, firstly, I described the advances of the DE algorithm. Then, to remedy some of DE's pitfalls, I proposed two variants of DE. Thirdly, theε-domination based orthogonal DE algorithm was proposed for the multi-objective optimization problems and the engineering design problems. Finally, the DE algorithm based on the point symmetry metric was proposed for the clustering problem. The main contributions of this dissertation are as follows.(1) Biogeography-Based Optimization (BBO) is a new biogeography inspired algorithm. It mainly uses the biogeography-based migration operator to share the information among solutions. To enhance the exploitation ability of DE, a hybrid DE with BBO, namely DE/BBO, is proposed for the global numerical optimization problem. DE/BBO combines the exploration of DE with the exploitation of BBO effectively, and hence it can generate the promising candidate solutions. To verify the performance of the proposed DE/BBO approach,23 benchmark functions with a wide range of dimensions and diverse complexities are employed. Experimental results indicate that the proposed approach is effective and efficient. Compared with other state-of-the-art DE approaches, DE/BBO performs better, or at least comparably, in terms of the quality of the final solutions and the convergence rate. In addition, the influence of the population size, dimensionality, different mutation schemes, and the self-adaptive control parameters of DE are also studied.(2) The fusion of multi-methods is an efficient technique to enhance the performance of the single algorithm. As the above-mentioned, the choice of the best mutation strategy is difficult for a specific problem. To alleviate this drawback and enhance the performance of DE, a family of improved DE that attempts to adaptively choose the suitable strategies for different problems is presented. In addition, in the proposed strategy adaptation mechanism (SaM) different parameter adaptation methods of DE can be used for different strategies. In order to test the efficiency of our approach, the proposed SaM is combined with JADE, which is a recently proposed DE variant, for numerical optimization. Twenty widely used scalable benchmark problems are chosen from the literature as the test suit. Experimental results verify the expectation that SaM is able to adaptively choose the suitable strategy for a specific problem without any prior knowledge. Compared with other state-of-the-art DE variants, the proposed approach performs better, or at least comparably, in terms of the quality of the final solutions and the convergence rate.(3) Evolutionary multi-objective optimization (EMO) has become a very popular topic in the last few years. However, to design an efficient and effective EMO algorithm to find the near-optimal and near-complete Pareto front is a challenging task. A novel differential evolution algorithm,ε-ODEMO, is proposed to solve multi-objective optimization problems (MOPs) efficiently. The proposed approach uses an Archive population to retain the obtained non-dominated solutions; also it adopts the orthogonal design method with quantization technique to generate an initial population of points that are scattered uniformly over the feasible solution space, so that the algorithm can evenly scan the feasible solution space once to locate good points for further exploration in subsequent iterations. Moreover, it is based on theε-dominance concept to obtain a good distribution of Pareto-optimal solutions and gets them in a small computational time. To make the algorithm converge faster, the new approach employs a hybrid selection mechanism in which a random selection and an elitist selection are interleaved. Experiments on eight benchmark problems of diverse complexities show that the proposed approach is able to obtain a good distribution in all cases. Compared with several other state-of-the-art evolutionary algorithms, it achieves not only comparable results in terms of convergence and diversity metrics, but also a considerable reduction of the computational effort. Furthermore, the discussion of the influence of different CR value and the parameter value of hybrid selection mechanism to the performance of the algorithm is conducted experimentally.(4) Solving engineering design and resources optimization via multi-objective evolutionary algorithms (MOEAs) has attracted much attention in the last few years. An efficient multi-objective differential evolution algorithm is presented for engineering design. The approach is the extension of s-ODEMO; and it has some modifications:1) An archive (or secondary population) is employed to keep the non-dominated solutions found and it is updated by a new relaxed form of Pareto dominance, called Pareto-adaptiveε-dominance (paε-dominance), at each generation.2) To handle the constraints, a new constraint-handling method is employed, which does not need any parameters to be tuned for constraint handling. The proposed approach is tested on four benchmark-constrained problems to illustrate the capabilities of the algorithm in handling mathematically complex problems. Furthermore, four well-studied engineering design optimization problems are solved to illustrate the efficiency and applicability of the algorithm for multi-objective design optimization. Compared with NSGA-II, one of the best MOEAs available at present, the results demonstrate that my approach is found to be statistically competitive. Moreover, the proposed approach is very efficient and is capable of yielding a wide spread of solutions with good coverage and convergence to true Pareto-optimal fronts.(5) To remedy some of the drawbacks of the evolutionary clustering algorithm, a DE-based clustering approach (DEPS-C), which combines several features of previous clustering techniques in a unique manner, is presented. It is characterized by:(a) employing a recently proposed point symmetry-based distance measure as the similarity measure, (b) incorporating the closure property when assigning a data point to a cluster, and (c) using the Kd-tree based nearest neighbor search to reduce the complexity of finding the closest symmetric point. Experiments have been conducted on 10 artificial data sets and 8 real-life data sets of diverse complexities. The results indicate that DEPS-C is suitable for both the symmetrical intra-clusters and the symmetrical inter-clusters. Compared with GAPS, a recently proposed genetic algorithm with point symmetry distance based clustering algorithm, the proposed approach performs better, or at least comparably, in terms of the quality and stability of the final solutions. Moreover, DEPS-C is faster than GAPS with respect to the execution time.(6) Based on the DEPS-C algorithm, an automatic clustering DE technique (ACDEPS) for the clustering problem is proposed. This approach can be characterized by:(ⅰ) proposing a modified point symmetry-based cluster validity index (CVI) as a measure of the validity of the corresponding partitioning, (ⅱ) using the Kd-tree based nearest neighbor search to reduce the complexity of finding the closest symmetric point, and (ⅲ) employing a new representation to represent an individual. Experiments have been conducted on 6 artificial data sets of diverse complexities. The results demonstrate that ACDEPS is suitable for both the symmetrical intra-clusters and the symmetrical inter-clusters. In addition, ACDEPS is able to find the optimal number of clusters of the dataset. Furthermore, based on the comparison with the original point symmetry-based CVI, the modified point symmetry-based CVI shows better performance in terms of the F-measure and the number of clusters found.In summery, based on the analysis of the original DE algorithm and its variants, this dissertation pointed out the drawbacks of DE. Meanwhile, I described the key techniques of the evolutionary clustering algorithms and analyzed their pitfalls. Based on the above-mentioned analysis, this dissertation is mainly studied:the BBO-based hybrid DE algorithm, DE with the adaptive multi-strategies selection,ε-domination based orthogonal DE for MOPs, paε-domination based constrainedε-ODEMO for engineering design, DE-based clustering algorithm with predefined K, and automatic DE-based clustering algorithm. In addition, for each improved algorithms, the experimental study was conducted to validate its performance.
Keywords/Search Tags:Differential evolution, global optimization, clustering analysis, adaptation, multi-objective optimization
PDF Full Text Request
Related items