Research On Heterogeneous Data Clustering Algorithm

Posted on:2011-01-27

Degree:Master

Type:Thesis

Country:China

Candidate:X Li

Full Text:PDF

GTID:2178330332962692

Subject:Computer application technology

Abstract/Summary:

Data clustering is an important branch of data mining. Most existing clustering algorithms handle only data with continuous (numeric) attributes, and a few handle only data with nominal (categorical) attributes. When data have mixed attributes, processing only one attribute type loses information and degrades the quality of the mining results. Clustering mixed-attribute data therefore remains a challenging problem. The main work of this thesis covers the following aspects:
1. The K-prototypes algorithm is introduced, and two improvements to it are proposed. The first is a new algorithm named CVAD (Categorical Value Attributes Decompose), which builds on the K-prototypes and fuzzy K-prototypes algorithms; it can overcome shortcomings of the original method and produce better clustering results. The second, also based on K-prototypes, chooses the initial points by group and further uses a genetic algorithm to select the groups.
2. A mixed-attribute clustering algorithm based on the BIRCH algorithm is proposed; experiments on UCI data sets indicate that it performs well.
3. A mixed-attribute clustering algorithm based on an improved DBSCAN algorithm is proposed, together with a description of the algorithm and a discussion of its merits.
4. A mixed-attribute clustering algorithm based on cluster fusion (Cluster Ensemble-based Mixed Attribute Cluster, CEMC) is proposed. It brings the cluster fusion methodology to the mixed-attribute clustering problem and extends it; the thesis establishes the algorithm framework, proposes the objective function and the algorithm, and tests the algorithm's validity on real data.
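
For readers unfamiliar with K-prototypes, the sketch below illustrates the combined dissimilarity commonly used in the standard algorithm: squared Euclidean distance on the numeric attributes plus a weight gamma times the number of mismatched categorical attributes. It is not the CVAD method or any other algorithm proposed in the thesis, whose details the abstract does not give. The function names, the record layout, and the default gamma are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# A prototype is assumed here to be (numeric values, categorical values).
Prototype = Tuple[Sequence[float], Sequence[str]]


def mixed_dissimilarity(
    x_num: Sequence[float],
    x_cat: Sequence[str],
    proto_num: Sequence[float],
    proto_cat: Sequence[str],
    gamma: float = 1.0,
) -> float:
    """Standard K-prototypes style dissimilarity (illustrative sketch)."""
    # Numeric part: squared Euclidean distance.
    numeric_part = sum((a - b) ** 2 for a, b in zip(x_num, proto_num))
    # Categorical part: simple matching, counting attributes that differ.
    categorical_part = sum(1 for a, b in zip(x_cat, proto_cat) if a != b)
    # gamma balances the influence of the two attribute types.
    return numeric_part + gamma * categorical_part


def nearest_prototype(
    x_num: Sequence[float],
    x_cat: Sequence[str],
    prototypes: List[Prototype],
    gamma: float = 1.0,
) -> int:
    """Return the index of the prototype with the smallest dissimilarity."""
    distances = [
        mixed_dissimilarity(x_num, x_cat, p_num, p_cat, gamma)
        for p_num, p_cat in prototypes
    ]
    return distances.index(min(distances))
```

In the standard algorithm the assignment step shown in `nearest_prototype` alternates with a prototype update (the mean of each numeric attribute and the mode of each categorical attribute) until the assignments stop changing. The choice of gamma matters: a value that is too small makes the categorical attributes irrelevant, and one that is too large lets them dominate.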

Keywords/Search Tags:

Data mining, Clustering, Mixed attributes, BIRCH algorithm, DBSCAN algorithm, Data fusion

Related items

1	The Study Of Application And Analysis About Clustering Algorithm In Data Mining
2	The Research Of Grouping Algorithm Of Users Of Chinese Platform Based On Personal Information By BIRCH Clustering
3	Data Mining, Cluster Analysis Algorithm Research And Application
4	The Research And Design Of Personal Internet Bank System Based On Data Mining
5	The Application Of Improved DBSCAN On DBMAS
6	Study On Data Partition DBSCAN Using Genetic Algorithm
7	The Research And Application On User Access Patterns Found Based On BIRCH Algorithm
8	Study On Data Fusion System Based On COW And Amelioration Of Fusion Algorithm
9	Research On Parallelization Of DBSCAN Clustering Algorithm For Spatial Data Mining Based On Spark Platform
10	Based On The Improved Clustering Algorithm In The Research And Implementation Of Data Mining System