Research on a Model of Hierarchical Metadata Storage in the Distributed Data Register Center Based on DOA

Posted on: 2016-12-14
Degree: Master
Type: Thesis
Country: China
Candidate: J J Yuan
Full Text: PDF
GTID: 2308330461455540
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the development of cloud computing and the accumulation of big data, research on data technology (DT) and data science (DS) is becoming increasingly urgent in order to meet the flexible application requirements of the "Internet+" era. Data itself has gradually become a kind of capital, and ideas such as data-oriented, data-service, and data-centric design have evolved. However, effectively organizing and managing rapidly expanding, massive heterogeneous data remains a major challenge for data scientists and data practitioners. In this context, the Data Oriented Architecture (DOA) has emerged; it is built around analyzing and managing large volumes of heterogeneous data.

DOA manages a large data resource pool using a system design that is data-centric and takes data identification as its main line. The data that describe the various attributes of the data in the resource pool are called metadata, and the data register center (DRC), a core component of DOA, manages this vast amount of metadata. As the data resource pool expands, the metadata registered by the DRC increases dramatically, so the performance and access-efficiency bottlenecks of the traditional DRC become increasingly prominent.

Building on existing DOA research results, this thesis considers the characteristics of managing large amounts of data and addresses the issues of distributed management and metadata storage in DOA.
First, a distributed summary model of the DRC is designed. Second, the flexible and scalable HBase is used to define and store metadata, and a scalable metadata specification is designed. Third, high-priority metadata is stored in distributed Memcached to improve access efficiency. Fourth, a metadata classification policy based on hotspot access and data value is designed. Finally, load balancing of the distributed DRC is achieved using a consistent hash ring algorithm with weighted virtual nodes.

The main work is as follows:
(1) The flexible and extensible HBase database is used to provide distributed storage of metadata and to extend the functionality of the data registry, establishing a scalable, distributed data register center model.
(2) Building on Memcached distributed caching, a hierarchical strategy based on hotspot access and data value is used to establish a hierarchical metadata storage mechanism for the distributed data registry and to improve the timeliness of data requests.
(3) A consistent hash ring algorithm with weighted virtual nodes is used to improve load balancing of the distributed data registry.

The main innovations of this thesis are:
(1) An implementation method is proposed for a clustered, distributed data registry that manages massive metadata. The method stores metadata in a scalable HBase database with distributed storage, overcoming the performance bottleneck of the original single-point data register center and the scalability bottleneck of the original data registry, which stored metadata in a relational database. The method also uses a consistent hash ring algorithm with weighted virtual nodes to improve load balancing of the distributed data registry.
(2) A distributed hierarchical metadata storage strategy is proposed. The strategy uses a hierarchical classification algorithm based on hotspot access and data value to select high-priority metadata, and caches that metadata on distributed Memcached servers so that it can be looked up quickly and efficiently.
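
The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract gives no formulas, so the MD5-based ring hash, the number of virtual nodes per unit of weight, the node names, and the `priority` and `storage_tier` helpers with their `alpha` and `threshold` parameters are all assumptions made for illustration. Weights are assumed to be positive integers.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a ring position (first 8 bytes of MD5; an assumed choice)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class WeightedHashRing:
    """Consistent hash ring where a node's virtual-node count is proportional to its weight."""

    def __init__(self, weights, vnodes_per_weight=100):
        self._points = []  # sorted ring positions of all virtual nodes
        self._owner = {}   # ring position -> physical node name
        for node, weight in weights.items():
            for i in range(weight * vnodes_per_weight):
                point = _hash(f"{node}#{i}")
                self._owner[point] = node
                bisect.insort(self._points, point)

    def node_for(self, key: str) -> str:
        """Return the node owning the first virtual node clockwise from the key."""
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


def priority(access_heat: float, data_value: float, alpha: float = 0.5) -> float:
    """Hypothetical score mixing hotspot access and data value, both scaled to [0, 1]."""
    return alpha * access_heat + (1 - alpha) * data_value


def storage_tier(score: float, threshold: float = 0.6) -> str:
    """Send high-priority metadata to the cache tier; the rest stays in the store only."""
    return "cache" if score >= threshold else "store"


if __name__ == "__main__":
    # Hypothetical DRC nodes; drc-b has three times the capacity of drc-a.
    ring = WeightedHashRing({"drc-a": 1, "drc-b": 3})
    print(ring.node_for("metadata:dataset-42"))
    print(storage_tier(priority(access_heat=0.9, data_value=0.8)))
```

Giving a node more virtual nodes increases the share of the ring it owns, so a higher-capacity server receives proportionally more keys. Adding or removing a server only moves the keys adjacent to that server's virtual nodes.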
Keywords/Search Tags: Distributed Data Register Center, HBase, Consistent Hash Ring, Distributed Cache, Hierarchical Storage, Data Oriented Architecture