Research On Clustering Method Based On Data Field

Posted on:2010-09-06

Degree:Master

Type:Thesis

Country:China

Candidate:F Guo

Full Text:PDF

GTID:2178360272479341

Subject:Computer application technology

Abstract/Summary:

With the advent of the information age, the production and collection of massive data has led to an information explosion, and data mining has become a hot research topic in computer science. Clustering analysis is an important task and method in data mining, and the efficiency and quality of clustering algorithms remain difficult problems in the field.

As an important branch of clustering analysis, density-based clustering holds a principal position because it can discover clusters of arbitrary shape and deal effectively with noisy data. DBSCAN is a classic density-based method; it has the advantages of density-based methods in general and is also fast. However, it has several disadvantages: the clustering parameters are hard to choose; clustering quality is low when the densities of the partitions differ; choosing the initial clustering object at random wastes time; and running a neighborhood search on every seed object costs memory and time.

To address these disadvantages, and considering that data objects in a data space are not independent but influence one another, the author combines clustering with data field theory and puts forward a new density-based clustering method based on the data field (DFDBSCAN). The algorithm carries the interaction between material particles, and the field methods used to describe it, over into an abstract data space, and it remedies DBSCAN's inadequacies by exploiting the relationship between the data field potential in the data space and the density of the data distribution.

The algorithm adopts a dynamic strategy to calculate the clustering radius, which solves the problem of unevenly distributed data. At the same time, it uses the relationship between field potential and data density to improve the choice of the initial clustering object and of the seed objects. As a result, DFDBSCAN improves both clustering quality and clustering efficiency while keeping the same time complexity as DBSCAN: efficiency improves to some extent, and the algorithm saves memory as well as time. The algorithm is grounded in mathematics and has a theoretical basis, and it has been verified on experimental data.
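
The idea of using field potential to pick the initial clustering object can be illustrated with a minimal sketch. The abstract does not give the thesis's potential function, so the code below assumes the Gaussian-like form commonly used in data field theory, phi(x) = sum_i m_i * exp(-(||x - x_i|| / sigma)^2), with unit masses by default. The names `potential`, `seed_order`, `sigma`, and `masses` are hypothetical and not taken from the thesis, and the dynamic calculation of the clustering radius is not attempted here.

```python
import math


def potential(points, sigma=1.0, masses=None):
    """Data-field potential at every point.

    Assumed form (not confirmed by the abstract):
        phi(x) = sum_i m_i * exp(-(||x - x_i|| / sigma) ** 2)
    `sigma` is a hypothetical influence factor; `masses` defaults to 1.0 each.
    """
    if masses is None:
        masses = [1.0] * len(points)
    phi = []
    for p in points:
        total = 0.0
        for q, m in zip(points, masses):
            d = math.dist(p, q)  # Euclidean distance (Python 3.8+)
            total += m * math.exp(-((d / sigma) ** 2))
        phi.append(total)
    return phi


def seed_order(points, sigma=1.0):
    """Indices of points sorted from highest to lowest potential.

    Points in dense regions have high potential, so visiting them first
    replaces the random choice of an initial clustering object.
    """
    phi = potential(points, sigma)
    return sorted(range(len(points)), key=lambda i: phi[i], reverse=True)


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
    for i in seed_order(data):
        print(i, data[i])
```

In this sketch the three nearby points reinforce one another's potential, so one of them is visited first, while the isolated point at (5.0, 5.0) comes last and is a natural noise candidate. The pairwise loop is quadratic in the number of points; it is meant to show the ordering idea and makes no claim about the cost of the thesis's implementation.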

Keywords/Search Tags:

clustering analysis, data field, DFDBSCAN, clustering quality

Related items

1	Research And Application Of New Methods In Symbolic Clustering
2	Research And Implementation Of Clustering Algorithm For Multidimensional Data Sets
3	Research On The Effectiveness Element Theory And Method Of Clustering Ensemble
4	Research On Dynamic Clustering And Incremental In Data Mining
5	Clustering Algorithms Analysis On Data Dimension
6	Study On Clustering Analysis And Clustering Result Evaluating Algorithms
7	The Research And Application Of Improved Data Competition Clustering Algorithm
8	Research On Data Clustering Algorithm In Wireless Sensor Networks
9	Clustering Fusion Algorithm And Its Application In Mobile Channel Management
10	The Research Of Grid-based Parallel Clustering Algorithm And Clustering For Data Stream