
Distributed Storage And Spatio-temporal Query Of Massive Vector Objects

Posted on: 2020-04-01
Degree: Master
Type: Thesis
Country: China
Candidate: C Xie
Full Text: PDF
GTID: 2370330590476773
Subject: Photogrammetry and Remote Sensing
Abstract/Summary:
With the rapid development of earth observation networks, wireless Internet, and ubiquitous Internet of Things technologies, massive volumes of spatio-temporal location data (such as vehicle trajectories, personal trajectories, social media data, and check-in data) have accumulated explosively. This type of data is streaming and dynamic, multi-dimensional, large in volume, and sparse. Mature GIS platforms cannot cope with the severe challenges posed by spatio-temporal big data, so distributed file systems and databases are now employed to manage it. A distributed file system can efficiently process offline batch tasks over big data, but it cannot provide low-latency queries. A distributed NoSQL database such as HBase handles highly concurrent reads and writes and supports basic queries in the form of key-value pairs, but it does not natively support multi-dimensional queries. Much research has therefore been conducted to add multi-dimensional query support and to improve its performance. Some of this work supports the storage and query of point objects only, and not the more complex line and polygon vector data. Other work can perform spatio-temporal queries on point and line vector data at the same time, but the efficiency of its range queries and k-NN queries still needs improvement.

To address these problems, this thesis proposes a complete scheme for storage organization, spatio-temporal indexing, and query and retrieval based on the distributed database HBase. First, according to the characteristics of HBase, three types of HBase table are designed for a vector object dataset: a metadata table, a coding table, and a data table. The metadata table mainly records meta information about the dataset, such as the coding schema, the vector type, and the spatial reference. The coding table is used for spatio-temporal range queries and k-NN queries, while the data table supports attribute queries on the spatio-temporal vector objects. Second, based on the concept of the spatio-temporal cube, we propose a concatenated spatio-temporal coding index that uses a space-filling curve for dimension reduction. Third, the thesis proposes a multi-class spatio-temporal query algorithm and a k-NN query algorithm that takes the data distribution into account: it expands the search according to the size of a control grid derived from the data distribution.

Taking New York taxi data as an example, three comparative experiments were carried out. The experiment on the maximum recursion depth of the spatial division showed that query performance was best when the maximum number of recursions was 6 to 8. In the performance comparison experiment, TS coding outperformed the other coding schemes in every query scenario. In the k-NN comparison experiment, the query time in dense areas was much longer than that in sparse areas. Together, the experiments validate that the system can store a large volume of spatio-temporal vector objects and can support low-latency spatio-temporal range queries and k-NN queries.
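The idea of a concatenated spatio-temporal code can be illustrated with a minimal sketch. The abstract does not give the thesis's actual encoding, so everything below is an assumption made for illustration: the function names (`interleave_bits`, `grid_cell`, `st_row_key`), the choice of a Z-order (Morton) curve as the space-filling curve, the fixed-size time bucket, the time-before-space ordering, and the key layout are all hypothetical.

```python
from typing import Tuple


def interleave_bits(x: int, y: int, bits: int) -> int:
    """Interleave the low `bits` bits of x and y into a Z-order (Morton) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)       # bits of x go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)   # bits of y go to odd positions
    return code


def grid_cell(lon: float, lat: float, level: int) -> Tuple[int, int]:
    """Map a lon/lat pair to (column, row) in a 2^level x 2^level global grid."""
    n = 1 << level
    col = min(int((lon + 180.0) / 360.0 * n), n - 1)  # clamp lon = 180
    row = min(int((lat + 90.0) / 180.0 * n), n - 1)   # clamp lat = 90
    return col, row


def st_row_key(lon: float, lat: float, epoch_seconds: int,
               level: int = 8, bucket_seconds: int = 3600) -> str:
    """Build a hypothetical row key: time bucket first, then the spatial code.

    Both parts are fixed width, so the lexicographic order of the keys
    matches the numeric order of (time bucket, Z-order code).
    """
    bucket = epoch_seconds // bucket_seconds
    col, row = grid_cell(lon, lat, level)
    z = interleave_bits(col, row, level)
    width = 2 * level
    return f"{bucket:010d}_{z:0{width}b}"
```

Under these assumptions, records that fall in the same time bucket and in nearby grid cells tend to receive row keys with a long common prefix, which is what lets a key-ordered store such as HBase answer a spatio-temporal range query with a small number of key-range scans.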
Keywords/Search Tags: Big data, Distributed database, Spatial-temporal index, k-NN Query