| Storing and querying massive data is a prerequisite for big data analysis, and how to do both efficiently and flexibly has become a focus of industry research. The SCUT energy consumption analysis platform currently relies on traditional relational databases for data storage. Because of inherent limitations in the relational model and architecture, however, these databases struggle to meet the performance and scalability demands of big data scenarios. NoSQL databases can alleviate these problems to some extent, but most of them provide only basic functions and offer limited support for complex queries and transaction management. In addition, NoSQL databases lack a standardized query language or interface. These differences make them hard to reconcile with SQL-based query logic and complicate migration from a relational database to a NoSQL database. To address these problems, this paper analyzes the characteristics of real-time data in detail and, on the basis of an extensive survey of existing technology, designs and implements an OpenTSDB-based massive real-time data storage system that integrates the advantages of RDBMS and NoSQL. The main idea is to build a heterogeneous database cluster that combines an RDBMS with a NoSQL database: data that is strongly relational and requires complex queries or transaction support is stored in the RDBMS, while OpenTSDB holds the massive real-time data. In the data persistence layer, this paper presents an innovative design based on aspect-oriented programming, which uses Spring AOP to enhance MyBatis, a data persistence framework for relational databases. With the query method enhanced in this way, the system can query both types of database in a single call, using the relational database's query result to drive the NoSQL query. The final result is returned in the form declared by the query method. 
This design does not require modifying the source code of the data persistence framework or the database, which achieves loose coupling between modules. It also leaves the upper-layer business logic unaffected, so it offers good compatibility and greatly reduces the difficulty of technology migration. This paper also provides an optimization scheme for sequential reads based on redundant storage, which improves sequential read performance for queries on different keys at the cost of extra storage space. A series of tests conducted on the SCUT energy consumption analysis platform shows that the proposed OpenTSDB-based massive real-time data storage system functions well and has good random and sequential access performance. |
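
The interception idea described in the abstract, where a relational query result drives a second query against the time-series store without changing the caller, might be sketched roughly as follows. This is not the paper's implementation: it substitutes a plain JDK dynamic proxy for Spring AOP so that the example runs without dependencies, and `Meter`, `MeterMapper`, `TimeSeriesStore`, and `enhance` are hypothetical names introduced only for illustration. No MyBatis or OpenTSDB API is used.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HybridQuerySketch {

    /** Hypothetical row type: relational metadata plus a slot for time-series values. */
    public static class Meter {
        public final String metricKey;                     // assumed to live in the RDBMS
        public List<Double> readings = new ArrayList<>();  // filled from the time-series store

        public Meter(String metricKey) {
            this.metricKey = metricKey;
        }
    }

    /** Hypothetical mapper interface, standing in for a relational data-access interface. */
    public interface MeterMapper {
        List<Meter> findByBuilding(String building);
    }

    /** Hypothetical stand-in for a time-series store client. */
    public interface TimeSeriesStore {
        List<Double> query(String metricKey);
    }

    /** Wraps a mapper so that every relational row triggers a time-series lookup. */
    public static MeterMapper enhance(MeterMapper target, TimeSeriesStore store) {
        InvocationHandler handler = (proxy, method, args) -> {
            // Step 1: run the original relational query unchanged.
            Object result = method.invoke(target, args);
            // Step 2: use each returned row to drive the time-series query.
            if (result instanceof List<?>) {
                for (Object row : (List<?>) result) {
                    if (row instanceof Meter) {
                        Meter meter = (Meter) row;
                        meter.readings = store.query(meter.metricKey);
                    }
                }
            }
            // The caller still receives the type declared by the mapper method.
            return result;
        };
        return (MeterMapper) Proxy.newProxyInstance(
                MeterMapper.class.getClassLoader(),
                new Class<?>[] {MeterMapper.class},
                handler);
    }

    public static void main(String[] args) {
        // In-memory stand-ins for the two databases.
        Map<String, List<Double>> series = new HashMap<>();
        series.put("b1.power", List.of(1.5, 2.5));

        MeterMapper plain = building -> List.of(new Meter(building + ".power"));
        TimeSeriesStore store = key -> series.getOrDefault(key, List.of());

        MeterMapper enhanced = enhance(plain, store);
        System.out.println(enhanced.findByBuilding("b1").get(0).readings);
    }
}
```

The calling code sees only `MeterMapper`, so wrapping or unwrapping the mapper leaves the upper-layer logic untouched, which is the loose-coupling property the abstract claims for the AOP-based design.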