A Massive Real-time Data Storage System Based on OpenTSDB

Posted on:2017-04-24

Degree:Master

Type:Thesis

Country:China

Candidate:R Q Shan

Full Text:PDF

GTID:2308330503968499

Subject:Software engineering

Abstract/Summary:

Storing and querying massive data is a prerequisite for big data analysis, and how to do both efficiently and flexibly has become a hot topic in industry research. The SCUT energy consumption analysis platform currently relies on traditional relational databases for data storage. However, because of inherent limitations in their theoretical model and architecture, relational databases struggle to meet the performance and scalability demands of big data scenarios. NoSQL databases can solve these problems to a certain extent, but most of them provide only basic functions and offer limited support for complex queries and transaction management. In addition, NoSQL databases lack a standardized query language or interface. These differences make it hard to stay compatible with SQL-based query logic and to migrate from a relational database to a NoSQL database. To solve these problems, this thesis analyzes the characteristics of real-time data in detail. Based on an extensive survey of existing technology, it designs and implements an OpenTSDB-based massive real-time data storage system that combines the advantages of RDBMS and NoSQL. The main idea is to build a heterogeneous database cluster composed of an RDBMS and a NoSQL database: data that is strongly relational, or that requires complex queries or transaction support, is stored in the RDBMS, while OpenTSDB holds the massive real-time data. In the data persistence layer, this thesis presents an innovative design based on aspect-oriented programming. The design uses Spring AOP to enhance MyBatis, a data persistence framework for relational databases. By enhancing its query methods, the system can query both types of database in a single call and use the relational query result to drive the NoSQL query. The final result is returned in the form declared by the query method.
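The interception idea described above can be illustrated with a minimal sketch. It is not the thesis's implementation: it uses a plain Python decorator as a stand-in for a Spring AOP around advice, and in-memory lists as stand-ins for the relational database and OpenTSDB. All names (`METERS`, `POINTS`, `query_points`, `with_time_series`, `find_meters_by_building`) are hypothetical and chosen only for illustration.

```python
from functools import wraps

# Hypothetical stand-in for the relational side: meter metadata rows.
METERS = [
    {"meter_id": "m1", "building": "A"},
    {"meter_id": "m2", "building": "A"},
    {"meter_id": "m3", "building": "B"},
]

# Hypothetical stand-in for the time-series side: (timestamp, value) per meter.
POINTS = {
    "m1": [(1, 10.0), (2, 11.5), (3, 12.0)],
    "m2": [(1, 5.0), (3, 6.5)],
    "m3": [(2, 7.0)],
}


def query_points(meter_id, start, end):
    """Return the points of one meter with start <= timestamp <= end."""
    return [(ts, v) for ts, v in POINTS.get(meter_id, []) if start <= ts <= end]


def with_time_series(start, end):
    """Wrap a relational query method, in the style of an around advice.

    The wrapper first runs the original query, then uses each returned row
    to drive a time-series lookup, and returns the combined rows.
    """
    def decorator(mapper_method):
        @wraps(mapper_method)
        def around(*args, **kwargs):
            rows = mapper_method(*args, **kwargs)  # proceed with the original query
            # Build new dicts so the relational rows themselves are not mutated.
            return [
                dict(row, points=query_points(row["meter_id"], start, end))
                for row in rows
            ]
        return around
    return decorator


@with_time_series(start=1, end=2)
def find_meters_by_building(building):
    """The 'mapper' method: it only knows about the relational data."""
    return [m for m in METERS if m["building"] == building]


result = find_meters_by_building("A")
```

The caller invokes `find_meters_by_building` as an ordinary query method, and the wrapper adds the time-series data without the method body or the caller changing. This mirrors the loose coupling the abstract claims for the AOP-based design.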
This design does not require modifying the source code of the data persistence framework or the databases, and it achieves loose coupling between modules. It also does not affect the business logic of the upper layer, so it offers good compatibility and greatly reduces the difficulty of technology migration. This thesis also provides an optimization scheme for sequential reads based on redundant storage, which improves sequential read performance when different keys are used, at the cost of extra storage space. A series of tests conducted on the SCUT energy consumption analysis platform shows that the proposed OpenTSDB-based massive real-time data storage system functions well and has good random and sequential access performance.
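The abstract does not detail the redundant-storage scheme, so the following is only a sketch of one plausible reading: each point is written twice, under two different key orders, so that a range scan by either key touches a contiguous run of sorted entries. The class `RedundantStore`, its methods, and the assumption of integer timestamps are hypothetical and not taken from the thesis.

```python
import bisect


class RedundantStore:
    """Keep two sorted copies of every point, one per key order."""

    def __init__(self):
        self.by_meter = []  # sorted (meter_id, timestamp, value)
        self.by_time = []   # sorted (timestamp, meter_id, value)

    def put(self, meter_id, timestamp, value):
        # Each write costs two insertions: the extra storage is the price
        # paid for contiguous scans under both key orders.
        bisect.insort(self.by_meter, (meter_id, timestamp, value))
        bisect.insort(self.by_time, (timestamp, meter_id, value))

    def scan_meter(self, meter_id, start, end):
        """All points of one meter with start <= timestamp <= end."""
        lo = bisect.bisect_left(self.by_meter, (meter_id, start))
        hi = bisect.bisect_left(self.by_meter, (meter_id, end + 1))
        return [(ts, v) for _, ts, v in self.by_meter[lo:hi]]

    def scan_time(self, start, end):
        """All points of all meters with start <= timestamp <= end."""
        lo = bisect.bisect_left(self.by_time, (start,))
        hi = bisect.bisect_left(self.by_time, (end + 1,))
        return list(self.by_time[lo:hi])


store = RedundantStore()
store.put("m1", 1, 10.0)
store.put("m2", 1, 5.0)
store.put("m1", 2, 11.5)
store.put("m2", 3, 6.5)

per_meter = store.scan_meter("m1", 1, 2)
per_time = store.scan_time(1, 1)
```

With a single key order, one of these two scans would have to skip over unrelated entries. Keeping both copies makes each scan a single slice, at roughly double the storage.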

Keywords/Search Tags:

Distributed System, OpenTSDB, HBase, NoSQL

Related items

1	Research And Development Of Big Data Storage Systems Based On Hbase
2	Research On Unified Storage And Access Interface For NoSQL Database
3	Design And Implementation Of Security Audit System For NoSQL
4	Research On Security Of NoSQL Databases Based On Hadoop
5	A HBase Based Massive Remote Sensing Metadata Search System
6	Design And Implementation Of Mass Data Storage Solution Based On HBase
7	Research On NoSQL Database Index Based On LSM Tree
8	The Research Of Big Data Storage Technology In Cloud Computing
9	Research And Application Of Performance Optimization Of Health Monitoring Big Data Platform Based On Hbase
10	The Research And Design Of Distributed Vertical Search Engine