
Research on Key Technologies of Spatial Vector Big Data Storage Model and High Performance Analysis in Distributed Environment

Posted on: 2024-02-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L L Sun
GTID: 1520307121482634
Subject: Cartography and Geographic Information System
Abstract:
In the era of big data, data is a basic national strategic resource and is gradually becoming the fifth factor of production. Data is ubiquitous across many fields and boosts the economy. With the development of traditional 3S technology, the Internet, data collection, and cloud computing, the volume of spatial big data is growing rapidly, and its applications in various fields are deepening. The development of the digital economy also places higher demands on the flow of data elements. Building an efficient storage model for spatial vector big data in a distributed environment, together with matching key technologies for efficient spatial analysis and visualization, is a basic technical prerequisite for a highly scalable, high-performance geographic information service system.

However, existing research focuses on spatial vector data storage models built on a single data model and storage platform, which cannot support the storage and management of spatial vector data at multiple data granularities. Meanwhile, because distributed storage environments differ from traditional single-machine storage, the distributed indexing mechanisms and analysis processing modes of existing models leave significant room for optimization, and their efficiency needs further improvement. In addition, each of these storage models is built for a single spatial analysis function, such as query, overlay analysis, proximity analysis, or rapid visualization, so supporting multiple applications multiplies the storage cost.

To address these problems, this dissertation is guided by the key technologies for efficient analysis and visualization of spatial big data with multiple modalities and analysis granularities in a distributed environment. It studies the spatial vector big data storage model in the distributed environment, in particular the key technologies of the spatial data storage model, spatial query processing, overlay analysis processing, proximity analysis processing, and rapid vector tile construction. A set of methods that connects multi-source storage, analysis, and fast dynamic visualization of spatial vector big data is proposed, and its feasibility and advantages are verified by experiments. In addition, taking natural resources big data management as the application field, we designed and implemented a prototype natural resources big data analysis system for the distributed environment, and verified the application value and practical significance of the research results through real natural resources management, planning analysis, and other applications involving spatial vector big data.

The innovative achievements and main contributions of this dissertation fall into the following four aspects:

(1) To meet the requirements of spatial analysis under different computing modes and data granularities, this dissertation proposes Fabric Geostore, a distributed storage model for spatial vector big data. Firstly, the theoretical basis and key technologies of spatial vector data storage were summarized. Secondly, based on the different storage and read characteristics of the key-value data model and the column-oriented data model, a distributed storage model for spatial vector big data built on HDFS and HBase was designed and implemented following the object-oriented storage model, with spatial metadata management provided for efficient data organization and filtering. The Fabric Geostore storage model was tested with the 2022 OpenStreetMap (OSM) data sets for Asia, which verified its data loading performance, storage resource utilization, and data reading performance.

(2) Based on the Fabric Geostore storage model, we studied high-performance spatial analysis methods for spatial vector data and proposed an elastic spatial analysis processing method that supports both online and offline computing modes as well as different data granularities, including files, blocks, rows, and columns. We proposed a server-side in-memory distributed R*-tree (SIR*-tree) spatial index, together with an index loading and spatial analysis processing framework, to accelerate processing. Firstly, for the spatial vector data stored in HDFS, we used Apache Spark, an in-memory parallel computing framework, for offline spatial analysis processing such as spatial queries, overlay analysis, and proximity analysis. Secondly, we proposed an agile, distributed spatial processing mechanism on the HBase server-side coprocessor framework, which supports low-latency parallel processing for spatial queries, overlay analysis, and proximity analysis. Using the 2022 OSM data set for Asia, we tested spatial query, overlay analysis, and proximity analysis in online and offline computing modes, and verified the performance and scalability of the proposed methods.

(3) Based on the proposed online spatial analysis processing method, we extended Fabric Geostore to support the rapid visualization of massive spatial vector data (SVD), specifically through vector tiles generated dynamically in a distributed environment. Firstly, we analyzed the theoretical basis and key technologies for fast SVD visualization. Secondly, by evaluating which information is necessary and which is redundant for visualization at different display levels, we proposed a hybrid data-based and pixel-based visualization model, which also combines geometry generalization and vector tile encoding to achieve rapid, dynamic visualization of SVD. The method directly uses the same SVD that is stored and indexed in Fabric Geostore on HBase, without an additional copy of the data. Moreover, the generated vector tiles can be saved in Fabric Geostore HBase tables to serve as a distributed tile cache and save storage costs. The 2017 OSM data sets for China were used for comparative tests of our proposed method against QGIS and ArcGIS Server. The experimental results verified that our method performs better for SVD visualization and can provide real-time dynamic visualization of SVD on display devices at 1080p and 2K resolution.

(4) Taking spatial big data management in the department of natural resources as the application scenario, we implemented a prototype system based on Fabric Geostore and the proposed spatial analysis and SVD visualization methods. We introduced the development and deployment environment of the system and selected three application cases: land-use spatial compliance review (LUCR), buffer analysis, and land-use transfer matrix analysis. We described the implementation of these three applications with Fabric Geostore and conducted performance tests on them with real-world land-use data sets. This study showed high application value and practical significance, and provides a high-performance solution for massive SVD management, analysis, and visualization in production applications.
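
As an illustration of the vector tile step described in contribution (3), the following is a minimal sketch of how point features might be assigned to tiles before generalization and encoding. The abstract does not state which tiling scheme Fabric Geostore uses; this sketch assumes the common XYZ (Web Mercator) scheme, and the function names `lonlat_to_tile` and `bucket_points_by_tile` are hypothetical, not taken from the dissertation.

```python
import math
from collections import defaultdict

# Latitude limit of the Web Mercator projection (assumed tiling scheme).
MAX_LAT = 85.05112878


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) index of the XYZ tile containing a lon/lat point."""
    n = 1 << zoom  # number of tiles along each axis at this zoom level
    lat = max(min(lat, MAX_LAT), -MAX_LAT)
    x = math.floor((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the eastern or southern edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def bucket_points_by_tile(points, zoom):
    """Group (lon, lat) points by their (zoom, x, y) tile key."""
    tiles = defaultdict(list)
    for lon, lat in points:
        x, y = lonlat_to_tile(lon, lat, zoom)
        tiles[(zoom, x, y)].append((lon, lat))
    return dict(tiles)


if __name__ == "__main__":
    sample = [(0.1, 0.1), (0.2, 0.2), (-90.0, 0.0)]
    for key, members in sorted(bucket_points_by_tile(sample, 2).items()):
        print(key, len(members))
```

In a store such as the one described, a key like `(zoom, x, y)` could plausibly serve as the lookup key for a cached tile, but that mapping is an assumption here and is not described in the abstract.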
Keywords: Spatial vector big data, Spatial data model, Spatial analysis, Visualization, Vector tile, Distributed geographic information system