Font Size: a A A

In-Memory Distributed Computing Based Approximate Query Processing On Spatial Data

Posted on:2018-01-19Degree:DoctorType:Dissertation
Country:ChinaCandidate:A G QiuFull Text:PDF
GTID:1310330512986031Subject:Cartography and Geographic Information Engineering
Abstract/Summary:PDF Full Text Request
Interactive visualization of geospatial data and spatial analysis are importantfeatures of GIS applications.Nevertheless,existing geospatial database and geospatial data service standards(WCS,WFS,WMS,WMTS)with their implementations are deficient in supporting those features.The root of this problem is that query results of geo-features are exact and unique;execution time and result size of spatial query are determined by the geo-feature itself;geo-features can't be generated on-the-fly at query time.In fact,realistic GIS applications accept error-bounded approximate geo-features;execution time and result size need to be considered as constraints of spatial queries;geo-features should be generated on-the-fly according to the query conditions.We propose the multiresolution representation of geospatial data,introduce approximate spatial query processing approaches,which are established on distributed in-memory computing,node's tree with hierarchical structure,weighted breadth first search of tree,implement approximate spatial query engine(ASQE)with postgresql extension.We conduct extensive experiments on OpenStreetMap data to evaluate the proposed algorithms and data structures.Based on the engine,we develop internet GIS proto-system which enables interactive geospatial data visualization to verify the effectiveness and efficiency of our methods.The topics of this dissertation are as follows:(1)Theories of approximate spatial queries and distributed computingOn the basis of approximate query,approximate spatial queries are defined after stating the requirement of geospatial data interactive visualization and online spatial analysis,considering the problem that large amount of data resulted from spatial queries degrade the performance of data visualization and transmission.Multiresolution representation of geospatial data is presented in followed sections.It is built by steps of dataset recursive division,data sampling,data operation and error evaluation.To speed up the computing-intensive data preprocessing of recursive division and error evaluation,distributed in-memory computing is employed.(2)Nodes' hierarchical representation building methodThe tree structures,which are hierarchical representations of geo-feature's nodes,is built with the implementation of multiresolution representation of geo-features'polylines.The multiresolution representation is implemented by denoting the recursive subdivision factor to 2 and defining the error to Hausdorff distance.Based on the preprocessing framework of recursive subdivision and error evaluation,the algorithm of constructing hierarchical representation of nodes with distributed in-memory computing is presented along with the storage model of the hierarchical structure in relational database and spatial index construction considering node weight.(3)Algorithm of spatial approximate query processing Combined with windowing query conditions and error constraint or time/data volume constraint,a progressive pruning algorithm along with weighted breadth-first search in the tree are used to sample the nodes to get approximated features from the database.Spatial query conditions and approximate query conditions are used spontaneously in the query processing to speed up the tree traversal.(4)Dynamization of the hierarchical structure of nodesWith the continuous updating of geographic feature,dynamization of the hierarchical structure of nodes is a necessary.Locally updating algorithm by minimizing loss function is presented.The amortized time complexity of the algorithm which is O(log n)is analyzed.The impact of parameters in constructing the node's tree is also analyzed.(5)Empirical study with SAQE and coastline data from OpenStreeMapA PostgreSQL-extension is implemented as ASQE to process the spatial approximate query needed by WebGIS application with interactive visualization.The global coastline data extracted from OpenStreetMap,which occupies 30GB in disk and comprise of more than 40 millions nodes,are used to verify the effectiveness of the ASQE-based WebGIS architecture and efficiency of proposed algorithms in the ASQE.
Keywords/Search Tags:Geospatial Data preprocessing, Multi-resolution Hierarchy Algorithm, Hierarchical Representation of Nodes, Weighted Breadth-First Search, Approximate Spatial Query Engine, Interactive Visualization, OpenStreetMap
PDF Full Text Request
Related items