Research On Parallelization Of Large Scale Raster Data Spatial Analysis Algorithm Based On MapReduce

Posted on: 2014-05-02  Degree: Master  Type: Thesis
Country: China  Candidate: W Q Yang  Full Text: PDF
GTID: 2270330467488826  Subject: Surveying and Mapping project
Abstract/Summary:
With the rapid development of Earth observation technologies, the volume of raster data has increased dramatically, and the traditional single-node architecture of GIS systems can no longer meet the needs of processing and analyzing large-scale raster data. How to improve large-scale raster data processing capability in a distributed parallel environment has become a focus of geoscientific research. Raster data spatial analysis involves large amounts of both data and computation, and is a typical form of data-intensive computing. Industry has put forward a variety of parallel computing models. Compared with the traditional MPI parallel programming model, the open-source MapReduce parallel programming model under the Hadoop framework is more suitable for data-intensive computing while also offering higher performance. This thesis mainly addresses the low efficiency of large-scale raster data computation by combining the MapReduce parallel programming model with typical raster spatial analysis algorithms. Data partitioning, parallel data import, and result fusion are analyzed from the perspective of large-scale raster data parallelism, and on that basis parallel algorithms for raster data spatial analysis are designed.
The main work is as follows. First, based on the characteristics of large-scale raster data, the thesis proposes building an efficient data organization model on HDFS, the distributed file system of the Hadoop framework, and, to address the neighborhood boundary problem in raster data processing algorithms, proposes a raster data block handling mechanism. Second, to address the slow read speed of traditional serial data access, a MapReduce-based parallel construction of the raster pyramid is designed, realizing parallel import of large-scale raster data. Third, using the MapReduce parallel programming model, parallel algorithms for basic terrain factors are designed, covering neighborhood-type topographic factors and topographic feature extraction, to improve the efficiency of large-scale raster data spatial analysis. Finally, comparative experiments against serial algorithms were carried out to verify the efficiency of the parallel raster spatial analysis algorithms. The results show that the MapReduce-based parallel algorithms outperform their serial counterparts. Moreover, the efficiency of the parallel algorithms gradually increases as the number of data nodes and the data volume grow. The MapReduce-based parallel design for raster data spatial analysis therefore effectively improves the computational efficiency for large-scale raster data.
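The block handling idea for neighborhood boundaries can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes the raster is split into row strips with a one-row overlap (halo), uses a simple central-difference slope as a stand-in for a neighborhood-type terrain factor, and runs the map and reduce steps as ordinary Python functions instead of Hadoop tasks. All function names (`split_with_halo`, `map_block`, `reduce_blocks`) are hypothetical.

```python
import math


def slope_at(grid, r, c, cell_size=1.0):
    """Central-difference slope at (r, c); neighbors are clamped at the grid edge.

    Clamping is a simplification chosen for this sketch, not the thesis's rule.
    """
    rows, cols = len(grid), len(grid[0])
    up = grid[max(r - 1, 0)][c]
    down = grid[min(r + 1, rows - 1)][c]
    left = grid[r][max(c - 1, 0)]
    right = grid[r][min(c + 1, cols - 1)]
    dz_dx = (right - left) / (2 * cell_size)
    dz_dy = (down - up) / (2 * cell_size)
    return math.hypot(dz_dx, dz_dy)


def serial_slope(grid, cell_size=1.0):
    """Reference result computed over the whole raster on a single node."""
    return [[slope_at(grid, r, c, cell_size) for c in range(len(grid[0]))]
            for r in range(len(grid))]


def split_with_halo(grid, block_rows, halo=1):
    """Partition the raster into row strips, each padded with halo rows.

    Each record is (start_row, offset, core_len, block), where offset is the
    index of the first core row inside the padded block.
    """
    rows = len(grid)
    blocks = []
    for start in range(0, rows, block_rows):
        lo = max(start - halo, 0)
        hi = min(start + block_rows + halo, rows)
        core_len = min(block_rows, rows - start)
        blocks.append((start, start - lo, core_len, grid[lo:hi]))
    return blocks


def map_block(record, cell_size=1.0):
    """Map step: compute slope for the core rows only; halo rows are inputs."""
    start, offset, core_len, block = record
    cols = len(block[0])
    out = [[slope_at(block, offset + i, c, cell_size) for c in range(cols)]
           for i in range(core_len)]
    return start, out


def reduce_blocks(mapped):
    """Reduce step: fuse block results back into one raster, ordered by row."""
    result = []
    for _, rows in sorted(mapped, key=lambda kv: kv[0]):
        result.extend(rows)
    return result


def parallel_slope(grid, block_rows, cell_size=1.0):
    """Run map over every block, then reduce; each map call is independent."""
    mapped = [map_block(rec, cell_size)
              for rec in split_with_halo(grid, block_rows)]
    return reduce_blocks(mapped)
```

Because every block carries the neighboring rows it needs, each map call can run on a different node without reading other blocks, and the fused output matches the single-node result. A 3x3 window needs a one-row halo; larger windows would need a correspondingly wider one.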
Keywords/Search Tags: MapReduce, Raster data, Spatial analysis, Parallel computing