Application Research Of EM Algorithm Based On Fuzzy Theory In Clustering Analysis

Posted on: 2016-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: X B Feng
GTID: 2308330461994876
Subject: Computer technology
Abstract/Summary:
Clustering analysis is an important topic in data mining, machine learning, and related fields. Many clustering algorithms exist, and each has advantages and disadvantages. Making clustering algorithms adapt to complex application scenarios, and making their results more accurate and stable, is crucial: one should not only choose an appropriate algorithm for the practical problem but also improve it so as to exploit its strengths and avoid its weaknesses. Meanwhile, as evaluation and decision methods become more scientific, the original data in many practical problems contain evaluations expressed in natural language. Such data are difficult to cluster because they cannot be processed directly by computers and traditional algorithms.

In this dissertation, the common finite multi-component, multidimensional Gaussian mixture probability distribution model is studied. The EM clustering algorithm, which is simple, stable, and well suited to this model, is chosen as the core, and the idea and essence of the EM algorithm are analyzed. Because samples in practical applications may contain a large amount of abnormal data, the EM algorithm is improved by combining it with fuzzy mathematics theory. The improvement makes the algorithm better at eliminating abnormal data and improves clustering accuracy; a threshold parameter controls the sensitivity to abnormal data, which improves the practicability of the algorithm.

For evaluations expressed in natural language in the original data, I apply fuzzy processing based on Fuzzy Theory: the evaluation language is converted with triangular fuzzy numbers into fuzzy data that retain its semantic features, so that it can be studied with fuzzy mathematics.
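As an illustration of representing evaluation language with triangular fuzzy numbers, the following is a minimal Python sketch. The five-level scale, its numeric breakpoints on [0, 1], and the names `TriangularFuzzyNumber`, `SCALE`, and `encode` are assumptions made for this example; the abstract does not specify the scale or the encoding used in the dissertation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriangularFuzzyNumber:
    """A triangular fuzzy number (low, peak, high) with low <= peak <= high."""
    low: float
    peak: float
    high: float

    def membership(self, x: float) -> float:
        """Degree (0..1) to which x belongs to this fuzzy number."""
        if x == self.peak:
            return 1.0  # also covers shoulders where low == peak or peak == high
        if x <= self.low or x >= self.high:
            return 0.0
        if x < self.peak:
            return (x - self.low) / (self.peak - self.low)
        return (self.high - x) / (self.high - self.peak)


# Hypothetical five-level linguistic scale on [0, 1] (an assumption, not from the thesis).
SCALE = {
    "very poor": TriangularFuzzyNumber(0.0, 0.0, 0.25),
    "poor": TriangularFuzzyNumber(0.0, 0.25, 0.5),
    "fair": TriangularFuzzyNumber(0.25, 0.5, 0.75),
    "good": TriangularFuzzyNumber(0.5, 0.75, 1.0),
    "very good": TriangularFuzzyNumber(0.75, 1.0, 1.0),
}


def encode(ratings):
    """Map a list of linguistic ratings to their triangular fuzzy numbers."""
    return [SCALE[r] for r in ratings]


if __name__ == "__main__":
    for term, tfn in zip(["good", "fair"], encode(["good", "fair"])):
        print(term, tfn, tfn.membership(0.625))
```

Overlapping neighbouring terms (for example "fair" and "good" both covering 0.5 to 0.75) is one way such an encoding can retain the graded meaning of the original wording.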
Finally, I convert the fuzzy data to crisp data through a defuzzification method, so that they can be clustered by the improved EM algorithm and help people make decisions in practical problems.

In this dissertation, a program is written based on the EM algorithm improved with Fuzzy Theory, and three instances are used to verify the algorithm's capacity to eliminate abnormal data and its ability to handle fuzzy data. Using the same parameter initialization method, I compare the fuzzy EM algorithm with the EM algorithm on an instance containing abnormal data and find that the fuzzy EM algorithm distinguishes the abnormal data in the sample better and that the abnormal data have only a weak influence on it. I also compare the fuzzy EM algorithm with a fuzzy clustering method on an instance containing evaluation language data and find that the fuzzy EM algorithm produces the correct clustering result, which shows that it handles fuzzy data well.
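The idea of using a threshold parameter to limit the influence of abnormal data on EM can be sketched as follows. This is not the dissertation's algorithm, which the abstract does not give in detail; it is a simplified one-dimensional Gaussian mixture EM in which any point whose mixture density is at or below a hypothetical `threshold` is flagged as abnormal and left out of the parameter update. The function names, the data, and the default threshold are all assumptions for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution N(mean, var) at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_with_threshold(data, means, variances, weights, threshold=1e-4, iters=50):
    """1-D Gaussian mixture EM that ignores low-density (abnormal) points.

    `threshold` is a hypothetical sensitivity parameter: a point whose
    mixture density is <= threshold gets zero responsibility everywhere.
    """
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    abnormal = [False] * len(data)
    for _ in range(iters):
        # E-step: responsibilities, with abnormal points zeroed out.
        resp = []
        for i, x in enumerate(data):
            dens = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens)
            abnormal[i] = total <= threshold
            resp.append([0.0] * k if abnormal[i] else [d / total for d in dens])
        kept = abnormal.count(False)
        if kept == 0:
            break  # every point was rejected; parameters cannot be updated
        # M-step: weighted updates over the points that were kept.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # empty component: leave its parameters unchanged
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a degenerate variance
            weights[j] = nj / kept
    return means, variances, weights, abnormal


if __name__ == "__main__":
    # Two tight groups plus one far-away value standing in for abnormal data.
    sample = [0.8, 0.9, 1.0, 1.1, 1.2, 4.8, 4.9, 5.0, 5.1, 5.2, 30.0]
    result = em_with_threshold(sample, [0.0, 6.0], [1.0, 1.0], [0.5, 0.5])
    print(result)
```

Raising `threshold` makes the sketch reject more points, and lowering it makes it behave more like ordinary EM, which mirrors the role the abstract assigns to the threshold parameter. A hard cut-off is only one possible design; a graded, membership-style down-weighting would be closer in spirit to fuzzy methods.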
Keywords/Search Tags: EM Algorithm, Fuzzy Theory, Triangular Fuzzy Number, Clustering Analysis