Research On Distributed Computing Of Raster Big Data Based On GeoTrellis

Posted on:2021-05-30

Degree:Master

Type:Thesis

Country:China

Candidate:H K Xiang

GTID:2480306473976489

Subject:Surveying Science and Technology

Abstract/Summary:

In recent years, advances in raster data acquisition technology in the geospatial information field have caused the volume of acquired raster data to grow explosively. This powerful acquisition capability stands in sharp contrast to weak data processing capability, which severely limits people's ability to obtain information and knowledge from massive raster data. How to process massive raster data efficiently has become an urgent problem in the field of geospatial information, and distributed technology offers a way to solve it. This thesis proposes a distributed computing approach that combines the GeoTrellis geographic computing engine with Hadoop and Spark to process massive raster data. Integrating GeoTrellis with Hadoop and Spark compensates for the shortcomings of distributed technology in processing massive raster data and ensures the effectiveness and efficiency of distributed storage and computation of massive raster data. The thesis designs a four-tier distributed architecture for massive raster data processing based on GeoTrellis, and designs and implements a distributed computing and storage system for massive raster data on top of that architecture. Also based on the GeoTrellis geoprocessing engine, the thesis designs and implements a Web test system with a B/S (browser/server) architecture for massive raster data processing. The test system can perform distributed computing on a global land cover classification map with a resolution of 10 meters. On the server side, the system provides raster data storage, pyramid construction, raster rendering, land cover classification statistics, and other functions; on the browser side, it provides raster data rendering and user interaction. Finally, the thesis tests the computing performance of the distributed system. The tests examine how the size of the distributed cluster, the volume of raster data, and the hardware resources in the cluster affect system performance under different raster computing applications, discuss issues such as the scalability and stability of the distributed system, and offer recommendations for building a massive raster data distributed system based on the GeoTrellis geoprocessing engine.
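
The classification statistics function mentioned in the abstract follows a pattern that suits distributed execution: cut the raster into tiles, count the cells of each class within every tile independently, then merge the partial counts. The sketch below illustrates only that pattern in plain Python on a tiny in-memory raster. It is not the thesis's implementation and does not use GeoTrellis, Hadoop, or Spark; the function names, the class codes (10, 20, 30), and the tile size are all hypothetical and chosen for illustration.

```python
from collections import Counter
from functools import reduce


def split_into_tiles(raster, tile_size):
    """Yield tiles (lists of rows) cut from a 2-D list of class codes.

    Edge tiles are smaller when the raster size is not a multiple of tile_size.
    """
    rows, cols = len(raster), len(raster[0])
    for r in range(0, rows, tile_size):
        for c in range(0, cols, tile_size):
            yield [row[c:c + tile_size] for row in raster[r:r + tile_size]]


def count_classes(tile):
    """Map step: count cells per class code inside one tile."""
    return Counter(cell for row in tile for cell in row)


def merge_counts(left, right):
    """Reduce step: merge two partial histograms into one."""
    return left + right


def class_statistics(raster, tile_size):
    """Per-class cell counts for the whole raster, computed tile by tile."""
    partials = map(count_classes, split_into_tiles(raster, tile_size))
    return reduce(merge_counts, partials, Counter())


# Hypothetical 4x4 classification raster with three made-up class codes.
raster = [
    [10, 10, 20, 20],
    [10, 30, 20, 20],
    [30, 30, 30, 10],
    [30, 30, 10, 10],
]

stats = class_statistics(raster, tile_size=2)
for code, cells in sorted(stats.items()):
    print(code, cells)
```

Because each tile is counted independently and merging is associative and commutative, the map step could in principle be spread across cluster workers and the partial results combined in any order, which is the property a framework such as Spark relies on for this kind of aggregation.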

Keywords/Search Tags:

massive raster data, distributed computing, Hadoop, Spark, GeoTrellis

Related items

1	Massive Spatial Data Storage And Management Based On Hadoop
2	Research On Physical Marine Big Data Cloud Computing Technology Based On Spark
3	Distributed Parallel Computing Environment Of Gml Spatial Data Partitioning Strategy And Algorithm Research
4	Integration And Development Of Natural Resource Spatial Data Application Platform Based On Hadoop
5	The Research To Massive Terrain Data Processing Method Based On Cloud Computing
6	The Research For Key Technology Of Astronomy Big Data Integration Based On Spark
7	The Key Techniques Of Cloud GIS Based On Hadoop
8	A Research On Distributed Logistics Optimization Algorithm Based On Spark
9	Research And Implementation Of Seismic Big Data Parallel Processing System Based On Spark
10	Storage And Parallel Query Technology Research In Distributed Environments Massive Spatial Data