Research On The Technology Of Efficient Mass Spatial Data Storage In The Cloud Computing Environment

Posted on: 2013-10-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z F Tu
GTID: 1220330395975879
Subject: Photogrammetry and Remote Sensing
Abstract/Summary:
With the development of the economy and technology, spatial data is no longer confined to specialized fields; it is increasingly used in land, resources, environment, transport, planning, and other areas. The building of "Smarter Planet", "Smarter China" and "Smarter City" greatly expands the range of applications of spatial data. Owing to the rapid development of earth observation technology, the means of spatial data acquisition are increasingly rich, and the volume of spatial data is growing geometrically. Massive spatial data places higher demands on the capacity, performance, availability and scalability of the storage system. Most existing spatial databases are built on top of relational databases, and as data volumes grow, problems of poor scalability, low concurrent read and write capability, and difficulty in changing the data structure have emerged, so these systems can no longer meet current application needs. In view of this, this dissertation proposes using cloud computing, especially NoSQL database technology, to construct massive spatial data storage. The main research work is as follows:

(1) To meet the capacity and scalability requirements of massive spatial data services, a scalable massive spatial data storage service architecture and a service model have been designed. The massive spatial data storage service system is organized in a hierarchical structure, with each layer built as a distributed cluster, so it scales well. A distributed message queue is applied to spatial data services to reduce system coupling, improve scalability, and buffer sudden surges of requests. A distributed cache is used to reduce the complexity of spatial data access logic and calculation logic, and thereby to reduce latency. For the implementation of spatial data services, the spatial data model and the spatial data service deployment have been designed to simplify service development. Because resource consumption grows with the amount of data, the monotone service mode is chosen for building spatial data services.

(2) Considering the characteristics of Redis, a high-performance NoSQL database, a distributed memory cache called Redis-RCache and a message queue called Redis-RMQ have been designed on top of Redis. Using the consistent hashing algorithm, a Redis-based cluster architecture is designed that supports both Redis-RCache and Redis-RMQ. Taking the application scenarios of a distributed memory cache into account, Redis-RCache is defined in terms of its cache item structure, cache coherency strategy, replacement algorithm, and related elements. Likewise, the design of Redis-RMQ takes the typical application scenarios of a distributed message queue into consideration, defines the queue message structure and the message visibility strategy, and focuses on the handling of toxic (poison) messages and on the message life cycle.

(3) Based on a detailed analysis of existing spatial data models and the Cassandra column-oriented storage model, a storage model for massive spatial data in column-oriented storage has been built, and a spatial data engine has been designed for the model. In the column-oriented storage model, spatial data is organized as a hierarchy of dataset group, dataset, data description, and data block. Three spatial data partitioning algorithms are designed for this model. Partitioning data into blocks allows it to be dispersed across the cluster, which improves concurrent read and write capability. The spatial data engine designed for this model establishes a mapping between column-oriented storage and the spatial data model, separating business logic from the underlying storage structure.

(4) Spatial queries are difficult in this environment, and a change of the index region easily forces the index to be rebuilt. To address these problems, and based on a detailed analysis of existing spatial indexing mechanisms and methods, a distributed and scalable quadtree indexing mechanism called DAE-QTree, together with its retrieval method, is designed for the column-oriented storage environment. The indexing mechanism, the encoding method, the storage structure, and the insertion and deletion methods are described in detail. The index divides the index region into a series of grids, and each grid builds its own quadtree index so that the index region can be expanded. The index information is dispersed across the cluster by creating a second column family, which makes rapid location of index entries for massive data possible. After a detailed analysis of the classic two-step spatial query method, a spatial query method based on this mechanism has been designed for column-oriented storage.
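
Item (2) names consistent hashing as the basis of the Redis cluster architecture. As a rough illustration of that general technique, and not of the dissertation's actual implementation, the following Python sketch places nodes on a hash ring using virtual points and routes each key to the next point clockwise. The class name ConsistentHashRing, the node names, the MD5-based hash, and the default of 100 virtual points per node are all assumptions made for this example.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Hypothetical hash ring: each node owns several virtual points."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points per node (assumed value)
        self._points = []         # sorted hash positions on the ring
        self._owner = {}          # hash position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        # Use the first 8 bytes of an MD5 digest as a 64-bit ring position.
        digest = hashlib.md5(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node):
        # Insert the node's virtual points, keeping the ring sorted.
        for i in range(self.replicas):
            point = self._hash(f"{node}#{i}")
            if point not in self._owner:
                bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node):
        # Drop only the virtual points that this node still owns.
        for i in range(self.replicas):
            point = self._hash(f"{node}#{i}")
            if self._owner.get(point) == node:
                del self._owner[point]
                del self._points[bisect.bisect_left(self._points, point)]

    def get_node(self, key):
        # Route the key to the first point clockwise from its hash.
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, self._hash(key))
        if idx == len(self._points):
            idx = 0  # wrap around the ring
        return self._owner[self._points[idx]]
```

For instance, a ring built from the hypothetical nodes "redis-a", "redis-b" and "redis-c" assigns each cache key to one node through get_node. Adding a fourth node afterwards reassigns only the keys that now fall to the new node's points, which is the property that makes the scheme attractive for a cache or queue cluster that must grow.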
Keywords/Search Tags: Massive Spatial Data Storage, Spatial Data Storage Model, Spatial Index, Cassandra, Cloud Computing