Fuzzy Clustering And Its Applied Research In The Chinese Text Clustering

Posted on: 2007-01-09
Degree: Master
Type: Thesis
Country: China
Candidate: C H Du
Full Text: PDF
GTID: 2190360185476973
Subject: Computational Mathematics
Abstract/Summary:
Document clustering partitions a document set into clusters such that documents within a cluster are as closely related in topic as possible, while documents in different clusters are as unrelated in topic as possible. Its purpose is to support information retrieval and pattern discovery, and to prepare for the categorization of newly arriving documents. With the rapid growth of information resources on the Internet, automatic document clustering has become increasingly important for searching information online.

This thesis systematically summarizes techniques for the automatic clustering of Chinese documents. It introduces the Vector Space Model, which is used to represent documents, and the process of building it. After presenting the basic concepts of fuzzy set theory and fuzzy clustering analysis, the thesis studies document clustering by means of fuzzy clustering analysis. Three methods are investigated in depth: the transitive closure method based on equivalence relations, the maximum fuzzy spanning tree method based on fuzzy graphs, and the partition-based Fuzzy C-Means (FCM) algorithm. The thesis proposes two algorithms for document fuzzy clustering: ATCFC, based on the transitive closure method, and ATCMT, based on the maximum fuzzy spanning tree method. FCM is further studied with respect to the data standardization method, the distance metric, and the selection of the initial cluster prototypes. Two drawbacks of FCM are analyzed, namely convergence to local optima and dependence on initialization, and an algorithm for document fuzzy clustering based on particle swarm optimization (PSO-FCM) is proposed. PSO-FCM uses a real-valued encoding of the cluster prototypes, relies on the global search of particle swarm optimization to guide the choice of cluster prototypes, and then performs the clustering itself with FCM.
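The transitive closure method mentioned above can be illustrated with a small sketch. This is not the ATCFC algorithm from the thesis, whose details the abstract does not give; it is a generic textbook version under stated assumptions: the input is a reflexive, symmetric fuzzy similarity matrix over documents, the closure is computed by repeated max-min squaring, and clusters are read off with a threshold (a lambda-cut). The function names and the sample matrix are hypothetical.

```python
def max_min_compose(r, s):
    """Max-min composition of two square fuzzy relations given as nested lists."""
    n = len(r)
    return [
        [max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(similarity):
    """Square the relation repeatedly until it stops changing.

    Assumes `similarity` is reflexive and symmetric, so the sequence
    R, R^2, R^4, ... converges to the transitive closure.
    """
    current = similarity
    while True:
        squared = max_min_compose(current, current)
        if squared == current:
            return current
        current = squared


def lambda_cut_clusters(closure, lam):
    """Group indices whose closure value is at least `lam`.

    Because the closure is a fuzzy equivalence relation, every lambda-cut
    is an ordinary equivalence relation, so one row per class is enough.
    """
    clusters = []
    assigned = set()
    for i in range(len(closure)):
        if i in assigned:
            continue
        group = [j for j in range(len(closure)) if closure[i][j] >= lam]
        assigned.update(group)
        clusters.append(group)
    return clusters


# Hypothetical similarity matrix for four documents (illustrative values only).
sample = [
    [1.0, 0.8, 0.2, 0.1],
    [0.8, 1.0, 0.3, 0.1],
    [0.2, 0.3, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
]
closure = transitive_closure(sample)
print(lambda_cut_clusters(closure, 0.5))  # [[0, 1], [2, 3]]
```

Lowering the threshold merges clusters and raising it splits them, so a single closure matrix yields a whole hierarchy of partitions.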
A prototype system for the automatic clustering of Chinese documents based on fuzzy clustering is implemented to verify the validity of the fuzzy clustering algorithms proposed above, and the experimental results show that the prototype system is effective.
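For readers unfamiliar with FCM, the following is a minimal sketch of the standard algorithm in plain Python. It is not the thesis's implementation and does not include the PSO step. The assumptions are: Euclidean distance, fuzzifier m = 2 by default, and a random initial membership matrix, which is exactly the initialization dependence the abstract names as a drawback. The function name `fuzzy_c_means` and all parameters are hypothetical.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Standard FCM on a list of equal-length numeric tuples.

    Returns (centers, memberships); memberships[i][j] is the degree to
    which point i belongs to cluster j, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([value / total for value in row])

    exponent = 2.0 / (m - 1.0)
    centers = []
    for _ in range(max_iter):
        # Prototype update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            weights = [u[i][j] ** m for i in range(n)]
            weight_sum = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / weight_sum
                for d in range(dim)
            ])

        # Membership update from the distances to every prototype.
        new_u = []
        for p in points:
            dists = [math.dist(p, center) for center in centers]
            if any(d == 0.0 for d in dists):
                # The point coincides with a prototype: give it full membership.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                new_u.append([h / sum(hits) for h in hits])
            else:
                new_u.append([
                    1.0 / sum((dj / dk) ** exponent for dk in dists)
                    for dj in dists
                ])

        # Stop once no membership value moved by more than the tolerance.
        change = max(
            abs(a - b) for old, new in zip(u, new_u) for a, b in zip(old, new)
        )
        u = new_u
        if change < tol:
            break
    return centers, u
```

In a document setting, each point would be a term-weight vector from the Vector Space Model. A PSO-based variant along the lines the abstract describes would replace the random initialization with prototypes found by a particle swarm search, then run the same update loop.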
Keywords/Search Tags: Fuzzy clustering, Document clustering, Vector space model, Particle swarm optimization