Distributed Storage of Remote Sensing Data Based on Hadoop

Posted on: 2019-05-26
Degree: Master
Type: Thesis
Country: China
Candidate: C Li
Full Text: PDF
GTID: 2310330545986160
Subject: Land Resource Management
Abstract/Summary:
With the rapid development of remote sensing technology, the volume of multi-sensor, multi-temporal, high-spatial-resolution, and high-spectral-resolution remote sensing data keeps growing, and the data types are becoming increasingly complex. The traditional stand-alone, centralized storage method cannot meet the requirements for efficient and safe storage of massive remote sensing data. It is therefore necessary to explore a new storage method that addresses the problems brought about by this growth. Distributed data storage technology based on the Hadoop platform provides a solution for efficient and secure storage of heterogeneous remote sensing data.

In this study, remote sensing data were divided into two types: structured and unstructured. The structured data were near-surface non-imaging hyperspectral data and attribute data; the unstructured data were satellite remote sensing image data. A storage model was designed for each type. On the structured data nodes, a relational SQL database developed with Microsoft Visual Studio 2010 and SQL Server 2008 was deployed to store the near-surface non-imaging hyperspectral data and attribute data. In the Hadoop cluster, the Hadoop Distributed File System (HDFS) stored satellite remote sensing image data blocks that had been split according to certain rules. To avoid wasting resources while building the system, a pseudo-distributed Hadoop cluster was first built in a virtual environment, and on that basis a storage model for satellite remote sensing image data and a system network topology were designed. To ensure safe operation, a Web server was introduced as a bridge between the system and the user, physically isolating the user from the system; user commands and system data were forwarded through the Web server. In addition, security mechanisms such as dual power supplies, dual hard disks, RAID 1, and user rights control were set up at different levels. The distributed remote sensing data storage system, comprising the Hadoop cluster, the Web server, and the structured data storage nodes, was built on a gigabit switched network, realizing the distributed storage of remote sensing data. The stability and efficiency of the system were verified through designed experiments. The following conclusions were drawn.

(1) A structured data management system in C/S (client/server) mode was developed with Microsoft Visual Studio 2010 and SQL Server 2008 and deployed on the structured data storage node in the cluster. It could store the two kinds of structured data (near-surface non-imaging hyperspectral data and attribute data), thereby separating them from the unstructured data and reducing system overhead.

(2) Using virtual machine software, an ordinary PC could serve as the hardware basis for virtualizing several Hadoop cluster nodes. Through network configuration, a pseudo-distributed Hadoop cluster architecture could be implemented and used as an experimental platform for designing a Hadoop-based distributed storage model and network topology for satellite remote sensing image data.

(3) Distributed storage of large satellite remote sensing images could be realized in the Hadoop cluster by recursively dividing each image into four equal parts, forming a tree-shaped distributed storage model of the form "image name-band name-image block name".

(4) A distributed remote sensing data storage system consisting of a distributed Hadoop cluster, a Web server, and distributed data nodes, which included 1 NameNode and 13 DataNodes, was built. The network topology, business process, and security mechanisms were designed to achieve distributed, efficient, and secure storage of massive heterogeneous remote sensing data.

Landsat-8 satellite remote sensing image data with a size of 885 MB were used as test data to compare the upload and download speeds of a single-machine data server with those of the distributed system. The average upload speed of the distributed system was 64.52 MB/s, against 24.48 MB/s for the single-machine data server; the distributed system uploaded data faster, and its upload speed was more stable. The average download speed of the distributed storage system was 71.38 MB/s, against 31.52 MB/s for the stand-alone data server, so the distributed system also downloaded data faster. A distributed system can therefore achieve efficient and stable storage of massive heterogeneous remote sensing data, providing a guarantee for more comprehensive and accurate analysis of remote sensing data.
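The recursive four-way division and the "image name-band name-image block name" tree described in conclusion (3) can be illustrated with a minimal Python sketch. The abstract does not give the thesis's actual splitting rules, recursion depth, or naming scheme, so the function names (`quad_split`, `block_paths`), the quadrant digits, the `block_` prefix, and the path layout below are all assumptions made for illustration. The sketch only computes pixel windows and path strings; it does not read imagery or talk to HDFS.

```python
from typing import Dict, Iterator, Tuple

# (row offset, column offset, height, width) of a block, in pixels
Block = Tuple[int, int, int, int]


def quad_split(row: int, col: int, height: int, width: int,
               depth: int, path: str = "") -> Iterator[Tuple[str, Block]]:
    """Recursively divide a rectangle into four parts, `depth` times.

    Yields (quadrant path, block) pairs. The quadrant path is a string of
    digits 0-3 (0 = upper left, 1 = upper right, 2 = lower left,
    3 = lower right), one digit per recursion level (hypothetical naming).
    When a side length is odd, the right/lower part is one pixel larger.
    """
    if depth == 0:
        yield path or "root", (row, col, height, width)
        return
    top, left = height // 2, width // 2  # sizes of the upper and left halves
    quadrants = [
        (row, col, top, left),
        (row, col + left, top, width - left),
        (row + top, col, height - top, left),
        (row + top, col + left, height - top, width - left),
    ]
    for index, (r, c, h, w) in enumerate(quadrants):
        yield from quad_split(r, c, h, w, depth - 1, path + str(index))


def block_paths(image: str, band: str, height: int, width: int,
                depth: int) -> Dict[str, Block]:
    """Map assumed "image/band/block" storage paths to pixel windows."""
    return {
        f"/{image}/{band}/block_{quad_path}": block
        for quad_path, block in quad_split(0, 0, height, width, depth)
    }


# Example: one band of a made-up 7000 x 7600 pixel image, split twice.
blocks = block_paths("scene_A", "B4", 7000, 7600, depth=2)
```

With `depth=2` each band yields 16 blocks whose windows tile the band without gaps or overlap. In a real deployment each window would be cut from the band raster and written to the cluster under its path; how the thesis performs that step is not stated in the abstract.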
Keywords/Search Tags: Remote Sensing Data, Distributed Storage, Hadoop, Virtualization, Data Security