
Massive Spatial Data Storage And Management Based On Hadoop

Posted on: 2018-09-11
Degree: Master
Type: Thesis
Country: China
Candidate: Q J Li
GTID: 2310330512982757
Subject: Cartography and Geographic Information System
Abstract/Summary:
With the rapid development of the geographic information industry, geospatial data, the lifeblood of GIS, is growing at an exponential rate, which makes spatial data retrieval, computation, and analysis increasingly difficult. At the same time, the range of fields in which GIS is applied keeps expanding, and the demand for spatial data accuracy keeps rising, so the storage and management of massive spatial data is becoming steadily harder. New methods and techniques are urgently needed to solve this problem. Hadoop, an open-source distributed system, has developed rapidly since 2005. Its two core technologies, HDFS and MapReduce, provide technical support for the distributed storage and parallel computing of geospatial data and therefore offer a new way to address the problems above.

This thesis studies the common storage modes and storage structures for geospatial data and, using HDFS and MapReduce, designs an unstructured data structure for storing massive geospatial data. A unified data conversion interface is also designed so that geospatial data from different sources, in different formats, and with different data structures can be stored in HDFS.

At present, most research on spatial data indexing targets a single machine, and there is little research on spatial indexes for distributed storage. Spatial data stored in HDFS is completely unordered and scattered, so retrieving the data a user needs requires traversing every node in the cluster. Therefore, after an in-depth study of several different data classification algorithms, the STR-tree is identified as an index suitable for massive spatial data management, and an STR-tree spatial index mechanism is established that consists of three levels: data partition, local index, and global index. Using the parallel processing of MapReduce, the index is computed in a distributed fashion, and the results are merged in the final Reduce stage and returned to the user, which greatly improves the efficiency of spatial data retrieval.
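The partition, local index, and global index idea can be illustrated with a small single-process sketch of Sort-Tile-Recursive (STR) packing. This is not the thesis's implementation: the function names (`str_partition`, `build_global_index`, `query`), the rectangle tuple layout, and the `capacity` parameter are all assumptions made for illustration, a plain loop stands in for the Map and Reduce stages, and a linear scan stands in for the per-partition local index.

```python
import math
from typing import List, Tuple

# Hypothetical layout: a rectangle is (xmin, ymin, xmax, ymax).
Rect = Tuple[float, float, float, float]


def center(r: Rect) -> Tuple[float, float]:
    return ((r[0] + r[2]) / 2, (r[1] + r[3]) / 2)


def mbr(rects: List[Rect]) -> Rect:
    """Minimum bounding rectangle of a non-empty list of rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def intersects(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def str_partition(rects: List[Rect], capacity: int) -> List[List[Rect]]:
    """STR packing: sort by x, cut into vertical slices, sort each slice
    by y, then pack runs of `capacity` rectangles into partitions."""
    if not rects:
        return []
    leaves = math.ceil(len(rects) / capacity)
    slices = math.ceil(math.sqrt(leaves))
    per_slice = slices * capacity
    by_x = sorted(rects, key=lambda r: center(r)[0])
    parts: List[List[Rect]] = []
    for i in range(0, len(by_x), per_slice):
        by_y = sorted(by_x[i:i + per_slice], key=lambda r: center(r)[1])
        for j in range(0, len(by_y), capacity):
            parts.append(by_y[j:j + capacity])
    return parts


def build_global_index(parts: List[List[Rect]]) -> List[Tuple[Rect, int]]:
    """Global index: one (MBR, partition id) entry per partition."""
    return [(mbr(p), pid) for pid, p in enumerate(parts)]


def query(window: Rect, parts: List[List[Rect]],
          global_index: List[Tuple[Rect, int]]) -> List[Rect]:
    """Scan only partitions whose MBR meets the window (map-like step),
    then merge the per-partition hits into one result (reduce-like step)."""
    hits: List[Rect] = []
    for box, pid in global_index:
        if intersects(box, window):
            # Linear scan here; a real local index would be a per-partition tree.
            hits.extend(r for r in parts[pid] if intersects(r, window))
    return sorted(hits)
```

In a distributed setting each partition would typically live in its own storage block, so the global index lets a query skip every partition whose bounding rectangle misses the search window instead of traversing all nodes.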
Keywords/Search Tags: Massive spatial data, Distributed index, Cloud computing, R-tree index