Research On Clustering Analysis Of Geographic Location Information Based On Hadoop

Posted on: 2022-03-28
Degree: Master
Type: Thesis
Country: China
Candidate: M Z Mu
Full Text: PDF
GTID: 2480306728980689
Subject: Master of Engineering
Abstract/Summary:
Geographic location information contains increasingly complex data as science and technology develop rapidly. Beyond longitude and latitude coordinates as the basic data, it now includes a large amount of multi-dimensional data. Traditional data mining cannot handle such complex data structures and content, and traditional clustering analysis is not suited to clustering massive data sets. Spatial data mining has therefore become a popular area of data mining. Applying an optimized clustering method to a model constructed specifically for the data can yield better analysis results.

This thesis describes a network model for analysing geographic location information, built by applying location data mining on top of traditional data mining. The model analyses the data using node degree, node betweenness, clustering coefficient, node strength, site capacity, and site longitude and latitude. To make the model more stable, time is added as a further weight, and the weighting is directional, so that the strength distribution conforms to scale-free characteristics. The experiments evaluate the information network model in terms of its distribution characteristics, operating efficiency and load-bearing pressure. Compared with the traditional data mining model, the information network model shows better stability.

Traditional clustering analysis is limited in the volume and dimensionality of data it can handle, so it cannot cluster the data with all factors taken into account. In this thesis, detailed clustering experiments are carried out in combination with the geographic location information network model. K-means clustering, spectral clustering, hierarchical clustering, Gaussian mixture model clustering and BIRCH clustering are implemented on the information network model, and the silhouette coefficients of the clustering results are compared across different numbers of clusters and different data volumes to verify the clustering quality. To improve the silhouette coefficient, a voting clustering model is proposed by optimizing the five algorithms and tuning their parameters. The model traverses all the nodes selected by the other clustering algorithms, assigns each node to a cluster by majority vote, and then computes the centroid of each cluster. In a further experimental comparison with the five clustering algorithms, the voting-based method keeps a more stable silhouette coefficient as the number of clusters and the data volume increase, and its silhouette coefficient is clearly higher.
Keywords/Search Tags: Geographic location information, Spatial data mining, Network model, Clustering analysis
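
The majority-vote step described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis's implementation: the abstract does not say how cluster ids from different algorithms are matched, so the `align_labels` helper (brute-force matching against the first labeling) is an assumption, as are the function names, the toy coordinates and the three example labelings.

```python
from collections import Counter
from itertools import permutations


def align_labels(reference, labels, k):
    """Relabel `labels` so its cluster ids agree with `reference` as much as possible.

    Assumption: ids are matched by trying every permutation of the k ids,
    which is only practical for a small number of clusters.
    """
    best_perm, best_overlap = None, -1
    for perm in permutations(range(k)):
        overlap = sum(1 for r, l in zip(reference, labels) if perm[l] == r)
        if overlap > best_overlap:
            best_perm, best_overlap = perm, overlap
    return [best_perm[l] for l in labels]


def vote(labelings, k):
    """Majority vote per point across several labelings of the same points."""
    reference = labelings[0]
    aligned = [reference] + [align_labels(reference, lab, k) for lab in labelings[1:]]
    # Ties go to the label seen first, i.e. they favour earlier labelings.
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*aligned)]


def centroids(points, labels, k):
    """Mean coordinate of each cluster; None for a cluster with no points."""
    result = []
    for c in range(k):
        members = [p for p, l in zip(points, labels) if l == c]
        if not members:
            result.append(None)
            continue
        dims = len(members[0])
        result.append(tuple(sum(p[d] for p in members) / len(members) for d in range(dims)))
    return result


# Toy (longitude, latitude) points; the values are made up for illustration.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
# Three hypothetical labelings, standing in for the output of different algorithms.
labelings = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same partition, ids swapped
    [0, 0, 1, 1, 1, 1],  # disagrees on the third point
]
consensus = vote(labelings, k=2)
centers = centroids(points, consensus, k=2)
```

In this toy case the second labeling is realigned to the first and the third is outvoted on the point where it disagrees, so `consensus` matches the first labeling. With many clusters, the permutation search would need replacing with an assignment solver.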