Research On Clustering Algorithms For Incomplete Data

Posted on:2018-10-06

Degree:Master

Type:Thesis

Country:China

Candidate:H L Sang

Full Text:PDF

GTID:2348330536981925

Subject:Computer Science and Technology

Abstract/Summary:

Since the beginning of the twenty-first century, connections between people, and between people and the physical world, have grown ever closer, and data is now generated everywhere. However, while the scale of data has grown almost explosively, data quality has not improved correspondingly and cannot be adequately guaranteed. During initial acquisition, exchange, and dissemination, a variety of conditions can introduce quality problems into the data we finally obtain. Commonly used clustering algorithms usually require high-quality data to work properly, and they tend to perform poorly when large-scale data has quality problems. The usual remedy is to clean the data first and only then run data mining operations such as clustering. But data cleaning on large-scale data often carries a very expensive time overhead, and the result may not be as good as hoped: even after a great deal of time is spent on cleaning, the data may still have quality problems, so the cleaning does not significantly improve the quality of the data mining results. Studying clustering directly on weakly available data therefore offers a new way to address this problem, that is, clustering the data without cleaning it first.

This thesis focuses on how to perform cluster analysis on incomplete data sets. First, it analyzes the spatial structure of incomplete data in order to understand how incompleteness affects clustering. It then designs an incomplete data clustering algorithm based on fuzzy clustering, which treats the missing values in the data as optimization variables of the iterative clustering process, updates them continuously during the iterations, and thereby completes the clustering of the incomplete data.

The density-based incomplete data clustering algorithm rests on two core requirements of the clustering process: a cluster center must be a point with a high density of surrounding points, and its distance from other points must be as large as possible. Once the cluster centers are determined, the remaining points are assigned to clusters according to a given strategy.

The incomplete data clustering algorithm based on information theory regards clustering as a process in which the uncertainty of the cluster changes. As attributes are added, the uncertainty about a record's category decreases, and the record is finally assigned to the cluster with the least uncertainty. For incomplete data, both the basic information-theoretic parameters and the information parameters of each cluster must be estimated; combining the two makes it possible to cluster incomplete data. Each algorithm design is followed by an experimental analysis of the algorithm.
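
The fuzzy clustering idea in the abstract, where missing values are treated as optimization variables and updated in every iteration, can be illustrated with a minimal sketch. It is not the thesis's algorithm, whose details are not given here; it uses a standard fuzzy c-means update together with the well-known "optimal completion" rule for missing entries. The function name `fcm_with_missing`, its parameters, the column-mean starting values, and the farthest-point initialization of the centers are all assumptions made for illustration.

```python
import numpy as np

def fcm_with_missing(X, n_clusters, m=2.0, n_iter=100, tol=1e-6):
    """Fuzzy c-means where NaN entries are extra variables re-estimated each iteration.

    Illustrative sketch only; assumes every column has at least one observed value.
    Returns (memberships, centers, completed_data).
    """
    X = np.array(X, dtype=float)          # work on a copy
    missing = np.isnan(X)

    # Assumption: start each missing entry at the mean of its column's observed values.
    col_means = np.nanmean(X, axis=0)
    X[missing] = np.take(col_means, np.nonzero(missing)[1])

    # Assumption: deterministic farthest-point initialization of the centers.
    centers = [X[0]]
    for _ in range(1, n_clusters):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d2)])
    centers = np.array(centers)

    for _ in range(n_iter):
        # Squared distances from every point to every center, shape (n, c).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)        # avoid division by zero

        # Standard fuzzy membership update.
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
        W = U ** m

        # Standard center update: membership-weighted mean of the points.
        new_centers = (W.T @ X) / W.sum(axis=0)[:, None]

        # Missing-value update: membership-weighted mean of the center coordinates.
        estimates = (W @ new_centers) / W.sum(axis=1, keepdims=True)
        X[missing] = estimates[missing]

        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break

    return U, centers, X
```

Observed entries are never modified; only the entries that were NaN in the input are re-estimated, so the completed data set is a by-product of the clustering rather than the result of a separate cleaning pass.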

Keywords/Search Tags:

incomplete data, clustering, fuzzy clustering, density based clustering, information theory

Related items

1	Research On Three-Way Clustering Method For Incomplete Data
2	Research On Blocking Fuzzy Clustering Algorithm Based On Density Of Samples
3	Research Of Fuzzy Clustering Algorithm For Incomplete Data Based On Improved BP Imputation
4	Research Of Fuzzy Clustering Algorithm For Optimizing Incomplete Data Based On Extreme Learning Machine
5	A Study Of Large-scale Data Clustering Based On Fuzzy Clustering And Its Application
6	Research Of Fuzzy Clustering Algorithm For Incomplete Data Based On Interval Analysis
7	A Improved Density Peaks Clustering Algorithm
8	The Application Of Granular Computing In Clustering Analysis
9	Research On Clustering Algorithms For Multi-density Distribution And Noisy Data
10	Researches On Fuzzy Clustering Methods Based On Information Granules