| With the explosive growth of traffic data, traditional traffic data processing technologies become inefficient when faced with PB-level data, and cloud computing provides a solution to this problem. The traffic cloud combines traffic big data with the Hadoop cloud platform and uses the NoSQL database HBase to store and process traffic data. HBase scales horizontally, storing large amounts of data across many inexpensive servers, and offers high reliability and stability. First, we propose an HBase-based cloud method for storing traffic big data. Traditional RDBMSs are inefficient and limited in capacity when storing traffic big data, and the workload involves random reads and writes, so we choose HBase to store traffic data with high interactive access efficiency. We design the HBase table around a specific rowkey derived from an investigation and analysis of historical traffic data. In addition, we build a dedicated secondary index for HBase, based on primary-key scans, which speeds up queries. Second, we propose a SQL query method by building Phoenix on top of HBase. Native HBase does not support SQL queries and can only retrieve data by a specific rowkey or by a global scan. A database without SQL support is difficult to accept for users who are accustomed to SQL queries. We therefore adopt Phoenix to translate SQL queries into HBase scans, which not only makes HBase more convenient to use but also improves query efficiency. Third, we propose a machine learning approach to automatically tuning HBase configurations. The system configuration parameters, which serve as the basic information for allocating resources while the cluster is running, directly determine how well an HBase cluster performs. HBase has as many as 200 configuration parameters, and overall performance under the default configuration is comparatively low. Faced with so many parameters, most developers configure them manually, which is time-consuming and does not yield a globally optimal configuration. For this reason, we adopt a machine learning method that trains a model with a random forest and searches for the best configuration with a genetic algorithm. This method finds the best configuration quickly and with high probability. We design and implement a traffic data processing system on HBase and then optimize queries in this system. Finally, we test the overall system performance, and the results show that storing traffic big data in HBase is reliable and efficient. |
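
The configuration search described in the abstract can be illustrated with a minimal sketch of a genetic algorithm over a small parameter space. It is not the implementation used in this work: the three parameter names, their ranges, and the default values are hypothetical, and `predicted_throughput` is a made-up stand-in for the trained random forest, used only so the example runs without external libraries.

```python
import random

# Hypothetical search space: knob name -> (low, high) integer bounds.
# Names and ranges are illustrative assumptions, not real HBase property keys.
SPACE = {
    "handler_count": (10, 200),
    "memstore_flush_mb": (64, 512),
    "block_cache_pct": (10, 60),
}

# Hypothetical "default" configuration used as a baseline.
DEFAULT = {"handler_count": 30, "memstore_flush_mb": 128, "block_cache_pct": 40}


def predicted_throughput(cfg):
    """Stand-in for the trained model: a made-up smooth function with one peak.

    In the approach described above, this role would be played by a random
    forest fitted on measured (configuration, performance) samples.
    """
    return -(
        ((cfg["handler_count"] - 120) / 190) ** 2
        + ((cfg["memstore_flush_mb"] - 256) / 448) ** 2
        + ((cfg["block_cache_pct"] - 45) / 50) ** 2
    )


def random_config(rng):
    """Draw one configuration uniformly from the search space."""
    return {k: rng.randint(lo, hi) for k, (lo, hi) in SPACE.items()}


def crossover(a, b, rng):
    """Uniform crossover: each knob is inherited from one of the two parents."""
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in SPACE}


def mutate(cfg, rng, rate=0.2):
    """Resample each knob with probability `rate`, staying inside its bounds."""
    out = dict(cfg)
    for k, (lo, hi) in SPACE.items():
        if rng.random() < rate:
            out[k] = rng.randint(lo, hi)
    return out


def genetic_search(score, seed=0, pop_size=30, generations=40, elite=2):
    """Return the best configuration found under `score`.

    The top `elite` individuals are carried over unchanged each generation,
    so the best score in the population never decreases.
    """
    rng = random.Random(seed)
    # Seed the population with the baseline so the result is never worse than it.
    population = [dict(DEFAULT)] + [random_config(rng) for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - elite)
        ]
        population = population[:elite] + children
    return max(population, key=score)


if __name__ == "__main__":
    best = genetic_search(predicted_throughput)
    print(best, predicted_throughput(best))
```

Because each candidate is scored by the model rather than by running a benchmark on the cluster, the search can evaluate many configurations cheaply. The quality of the result then depends on how well the model approximates real cluster behavior.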