Research On The Mechanism Of Big Spatial Data Storage And Index In Cloud Computing Environment

Posted on: 2020-06-15
Degree: Master
Type: Thesis
Country: China
Candidate: X L Li
GTID: 2370330575999056
Subject: Cartography and Geographic Information System
Abstract/Summary:
With the advent of the Big Data era, spatial data is growing at an unprecedented rate and is characterized by multiple sources, multiple scales, multiple time phases, global coverage, and high resolution. The wide use of GIS in everyday life means that the number of users keeps growing and the demand for real-time spatial retrieval and acquisition keeps increasing. Spatial data is the "blood" of GIS, yet traditional spatial data management methods can no longer meet users' needs. The development of cloud computing has made Hadoop and MapReduce well suited to the parallel access and processing of massive spatial data. Therefore, given the characteristics of spatial data, such as large data volume, topological and semantic relationships, and frequent updates, it is necessary to use the Hadoop platform to design a reasonable storage structure for massive spatial data and to construct an efficient index. This thesis uses the cloud computing platform Hadoop, the distributed database HBase, and the distributed computing model MapReduce to study the storage and indexing mechanism of massive spatial data. Taking OSM spatial data as an example, the work covers the following three aspects:

(1) The structure and characteristics of OSM spatial data are analyzed, and a spatial data storage model and an incremental data organization method are designed for the management requirements of massive spatial data in a cloud computing environment. In addition, to preserve the integrity of feature geometry and topological relationships, the default replica placement strategy of HDFS is examined and a replica placement strategy suited to spatial data is studied.

(2) To address the uneven distribution of massive spatial data and to preserve the adjacency of spatial data, common spatial data partitioning strategies are compared and analyzed, and a partitioning strategy based on the STR tree is proposed. To improve spatial data indexing efficiency and analytical performance, the partitioning is carried out in parallel using MapReduce.

(3) Traditional spatial indexing mechanisms are introduced and their advantages and disadvantages are analyzed. For the partitioned spatial data, an R-tree is used to build a local index in a bottom-up manner according to the distribution of geographic features, and an STR tree is then used to build a global index. The local index information is stored on the DataNodes and the global index information on the NameNode, which improves the efficiency of spatial data retrieval.

Finally, a Hadoop distributed environment is deployed and tested with OSM data. The storage and query performance of spatial data is compared across different data volumes and different numbers of cluster nodes. The results verify that both the storage and the indexing of spatial data in the distributed environment perform well and can meet the needs of spatial data storage and retrieval.
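The STR-based partitioning mentioned in aspect (2) can be illustrated with a minimal single-machine sketch of Sort-Tile-Recursive packing. This is not the thesis implementation: the function names `str_partition` and `mbr`, the representation of features as 2D points, and the `capacity` parameter are assumptions made for illustration, and the MapReduce parallelization described in the abstract is not shown.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def str_partition(points: List[Point], capacity: int) -> List[List[Point]]:
    """Group points into tiles of at most `capacity` points using STR packing.

    Hypothetical helper for illustration only.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if not points:
        return []

    # Number of tiles needed, and number of vertical slices (about sqrt of that).
    n_tiles = math.ceil(len(points) / capacity)
    n_slices = math.ceil(math.sqrt(n_tiles))
    slice_size = n_slices * capacity

    # Step 1: sort by x and cut into vertical slices.
    by_x = sorted(points, key=lambda p: p[0])
    tiles: List[List[Point]] = []
    for i in range(0, len(by_x), slice_size):
        # Step 2: within each slice, sort by y and pack runs of `capacity` points.
        vertical_slice = sorted(by_x[i:i + slice_size], key=lambda p: p[1])
        for j in range(0, len(vertical_slice), capacity):
            tiles.append(vertical_slice[j:j + capacity])
    return tiles


def mbr(tile: List[Point]) -> Tuple[float, float, float, float]:
    """Minimum bounding rectangle of a non-empty tile: (min_x, min_y, max_x, max_y)."""
    xs = [p[0] for p in tile]
    ys = [p[1] for p in tile]
    return (min(xs), min(ys), max(xs), max(ys))


# Example: a 4 x 4 grid of points packed into tiles of 4 points each.
grid = [(float(x), float(y)) for x in range(4) for y in range(4)]
tiles = str_partition(grid, capacity=4)
```

Because each tile holds roughly the same number of features regardless of how densely they cluster, this kind of packing is one way to balance partitions while keeping nearby features together. The bounding rectangles returned by `mbr` are the sort of per-partition summary a global index could store.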
Keywords/Search Tags:Cloud Computing, Hadoop, Spatial Data, Data Storage, Spatial Index