
Optimization Techniques Of Traffic Data Processing Based On HBase

Posted on: 2016-03-04   Degree: Master   Type: Thesis
Country: China   Candidate: Y K Qiao   Full Text: PDF
GTID: 2272330467993342   Subject: Computer Science and Technology
Abstract/Summary:
The rapid development of big data and cloud computing technology has provided new ideas for massive data storage and query beyond relational databases. When faced with massive data processing, relational databases are limited by insufficient processing power and scalability. In the field of intelligent transportation systems, many kinds of monitoring data are growing massively. Using relational databases to handle traffic data exposes many problems, such as insufficient write performance for high-speed real-time data streams and low query efficiency once the data scale reaches tens of millions or billions of records. As the de facto standard for massive data processing, Hadoop/MapReduce is used by a wide variety of major companies to meet business needs in massive-data scenarios. Built on top of the Hadoop framework, HBase can use the Hadoop file system to store its data and the MapReduce framework for parallel computing, and thus offers high reliability and high availability. To overcome the performance bottleneck that intelligent transportation systems encounter with traditional relational databases, we use HBase as the storage engine for traffic flow data and study the storage and query problems on this basis.
The main work is as follows. First, we study and analyze the related technologies in depth, focusing on the HBase architecture, its data model, and its lightweight coprocessor framework, and, based on the HBase data model, design a traffic data storage model centered on a composite HBase row key. Second, we propose an HBase-based real-time storage system for traffic flow data. Its multi-level processing architecture consists of a pre-processing layer, a data caching layer, and a data write layer; collaboration among multiple components and data caching improve its real-time processing capability. Third, in line with real requirements, we introduce secondary indexes on non-row-key columns to improve query efficiency on those columns, and we introduce a SQL parser module: SQL parsing is optimized and each statement is executed by the corresponding execution engine, which makes queries easier to write while improving query efficiency. Finally, we run experiments on the whole system; the results show that the real-time storage scheme, the secondary index, and the SQL parser meet real-world needs.
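To make the composite row key and secondary-index ideas more concrete, here is a minimal sketch in Python. The abstract does not give the actual key layout, so everything here is an assumption made for illustration: the fields (station ID, timestamp, license plate), the salt bucket count, the reversed-timestamp trick, and the index key format. Plain dictionaries stand in for the HBase data and index tables, and no HBase client API is used.

```python
import hashlib

NUM_BUCKETS = 16          # assumed number of salt buckets (hypothetical)
MAX_TS = 10**13 - 1       # assumed upper bound on millisecond timestamps


def make_row_key(station_id: str, ts_ms: int) -> bytes:
    """Composite row key: salt | station | reversed timestamp (assumed layout)."""
    # The salt spreads stations across key ranges to avoid write hot spots.
    salt = int(hashlib.md5(station_id.encode()).hexdigest(), 16) % NUM_BUCKETS
    # Reversing the timestamp makes newer records sort first within a station.
    reversed_ts = MAX_TS - ts_ms
    return f"{salt:02d}|{station_id}|{reversed_ts:013d}".encode()


def make_index_key(plate: str, row_key: bytes) -> bytes:
    """Index row key: indexed column value followed by the data row key."""
    return plate.encode() + b"|" + row_key


class ToyStore:
    """In-memory stand-in for a data table plus one secondary-index table."""

    def __init__(self):
        self.data = {}    # row key -> record
        self.index = {}   # index key -> empty marker

    def put(self, station_id, ts_ms, plate, speed):
        row_key = make_row_key(station_id, ts_ms)
        self.data[row_key] = {"plate": plate, "speed": speed}
        # Keep the index in step with the data write.
        self.index[make_index_key(plate, row_key)] = b""
        return row_key

    def query_by_plate(self, plate):
        """Emulate a prefix scan on the index, then fetch the data rows."""
        prefix = plate.encode() + b"|"
        hits = sorted(k for k in self.index if k.startswith(prefix))
        return [self.data[k[len(prefix):]] for k in hits]


store = ToyStore()
rk_old = store.put("S001", 1400000000000, "A12345", 62.5)
rk_new = store.put("S001", 1400000001000, "B99999", 48.0)
print(store.query_by_plate("A12345"))
```

Embedding the data row key in the index key lets a query on a non-row-key column resolve to direct row lookups instead of a full table scan. In a real deployment the index write would need to be kept consistent with the data write, for example through the coprocessor framework the abstract mentions; that part is not modeled here.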
Keywords/Search Tags: HBase, data processing, intelligent transportation system, real-time storage, secondary index