Massive Spatial Data Storage And Management Based On Hadoop

Posted on:2018-09-11

Degree:Master

Type:Thesis

Country:China

Candidate:Q J Li

Full Text:PDF

GTID:2310330512982757

Subject:Cartography and Geographic Information System

Abstract/Summary:

With the rapid development of the geographic information industry, geospatial data, the lifeblood of GIS, are growing at an exponential rate, which makes spatial data retrieval, computation, and analysis increasingly difficult. In addition, GIS is being applied in an expanding range of fields, and the demand for spatial data accuracy keeps rising, so the storage and management of massive spatial data are becoming ever more difficult. New methods and techniques are therefore urgently needed. Hadoop, an open-source distributed system, has developed rapidly since 2005. Its two main technologies, HDFS and MapReduce, provide technical support for the distributed storage and parallel computing of geospatial data and offer a new way to solve these problems.

This thesis studies common geospatial data storage modes and storage structures and, using HDFS and MapReduce, designs an unstructured data structure for storing massive geospatial data. A unified data conversion interface is also designed so that geospatial data from different sources, in different formats, and with different data structures can be stored in HDFS.

At present, most research on spatial data indexing targets a single machine, and there is little research on spatial indexes for distributed storage. Spatial data stored in HDFS is unordered and scattered, so retrieving the data a user needs requires traversing every node in the cluster. After an in-depth study of several data classification algorithms, the STR tree is identified as a suitable index for massive spatial data management, and an STR tree spatial index mechanism consisting of data partition, local index, and global index is established. Through MapReduce parallel processing, the index computation is distributed, and the results are merged in the final Reduce stage and returned to the user, which greatly improves the efficiency of spatial data retrieval.
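The partition, local index, and global index layering described above can be illustrated with a small single-process sketch. It is not the thesis's implementation: it assumes STR refers to the usual Sort-Tile-Recursive packing, it builds only a flat two-level structure rather than a full tree, it runs in plain Python instead of HDFS and MapReduce, and every name in it (`str_pack`, `build_index`, `query`, `partition_size`, `leaf_capacity`) is hypothetical.

```python
import math
from typing import List, Tuple

# A rectangle is (min_x, min_y, max_x, max_y). All names here are illustrative.
Rect = Tuple[float, float, float, float]


def mbr(rects: List[Rect]) -> Rect:
    """Minimum bounding rectangle of a non-empty list of rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def intersects(a: Rect, b: Rect) -> bool:
    """True if the two rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def str_pack(rects: List[Rect], capacity: int) -> List[List[Rect]]:
    """One level of Sort-Tile-Recursive packing: sort by center x, cut into
    vertical slices, sort each slice by center y, cut into tiles."""
    if not rects:
        return []
    tile_count = math.ceil(len(rects) / capacity)
    slice_count = math.ceil(math.sqrt(tile_count))
    slice_size = slice_count * capacity
    by_x = sorted(rects, key=lambda r: (r[0] + r[2]) / 2)
    tiles = []
    for i in range(0, len(by_x), slice_size):
        by_y = sorted(by_x[i:i + slice_size], key=lambda r: (r[1] + r[3]) / 2)
        for j in range(0, len(by_y), capacity):
            tiles.append(by_y[j:j + capacity])
    return tiles


def build_index(rects: List[Rect], partition_size: int, leaf_capacity: int):
    """Global index: one (partition MBR, local index) entry per partition.
    Local index: one (leaf MBR, rectangles) entry per leaf in that partition."""
    index = []
    for part in str_pack(rects, partition_size):
        local = [(mbr(leaf), leaf) for leaf in str_pack(part, leaf_capacity)]
        index.append((mbr(part), local))
    return index


def query(index, window: Rect) -> List[Rect]:
    """Window query: prune partitions with the global index, then leaves
    with the local index, then test the remaining rectangles."""
    hits = []
    for part_mbr, local in index:
        if not intersects(part_mbr, window):
            continue  # the whole partition is skipped
        for leaf_mbr, leaf in local:
            if intersects(leaf_mbr, window):
                hits.extend(r for r in leaf if intersects(r, window))
    return hits


# Usage: a 10 x 10 grid of small squares, split into partitions of 25.
rects = [(float(x), float(y), x + 0.5, y + 0.5)
         for x in range(10) for y in range(10)]
window = (2.0, 2.0, 3.0, 3.0)
index = build_index(rects, partition_size=25, leaf_capacity=5)
result = query(index, window)
```

In a distributed setting, each partition's local search would run as a separate parallel task and the per-partition hits would be merged at the end; here the loop in `query` stands in for both steps.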

Keywords/Search Tags:

Massive spatial data, Distributed index, Cloud computing, R-tree index

Related items

1	Efficient Storage And Parallel Overlay Analysis Of Massive Vector Data In The Cloud Computing Environment
2	Research On The Mechanism Of Big Spatial Data Storage And Index In Cloud Computing Environment
3	Construction Market Gis Spatial Data Modeling And Realization
4	Research Of A Distributed Multi-Layer R Tree Spatial Index Based On HDFS
5	A Study Of The Method Of Internal And External Organization Scheduling Of Massive Point Cloud Data
6	Massive Three-dimensional Point Cloud Management And Visualization Research
7	Optimization And Acceleration Of Spatiotemporal Ripley's K Function For Enabling Massive Point Pattern Analysis
8	Research On The Technology Of Efficient Mass Spatial Data Storage In The Cloud Computing Environment
9	Storage And Parallel Query Technology Research In Distributed Environments Massive Spatial Data
10	The Key Techniques Of Cloud GIS Based On Hadoop