| As the transportation industry moves toward informatization, traffic police bureaus face large data volumes, limited capacity for technical innovation, and a shortage of specialist staff. Informatization reform is an effective way for traffic police departments to escape the dilemma of "difficult data analysis", and the key to that reform is improving their ability to model and analyze massive traffic data. To address the weak SQL skills of business operators, the lack of an efficient and reliable data sharing model, and the low efficiency of data queries in the Traffic Police Bureau, this paper studies data sharing model technology and computing engine technology, solves the Bureau's data sharing and data analysis problems, and promotes its transformation to a data-driven mode of operation. First, this paper designs a combined computing engine that covers the needs of multiple query types and serves as the engine of the data analysis platform. The mainstream computing engines in the industry are assessed with a goodness evaluation method, and two of them, Kylin and Greenplum, are selected to form the combined computing engine for the Traffic Police Bureau. To avoid excessive expansion of the data cube, this paper builds the Kylin cube using association rule analysis of query logs, which uncovers the internal relationships among the aggregated dimensions and is used to prune and optimize the cube spanning tree. This paper also extends Greenplum's data distribution strategy: when the number of cluster nodes satisfies a certain rule, a two-time consistent hashing algorithm with lower search complexity is used to distribute the data, which prevents the "barrel effect", in which the most heavily loaded node limits overall performance, from degrading the computing performance of the Greenplum cluster. Because combining computing engines introduces many system components and therefore a resource scheduling problem for the system's applications, this paper uses YARN for unified resource scheduling and management of Kylin, Greenplum, and related application components, and applies heuristic algorithms to optimize the scheduling strategy. Second, so that the computed data models can be shared with data consumers more securely and efficiently, this paper analyzes several traditional data sharing mechanisms, proposes a blockchain-based sharing model, and analyzes the responsibilities of, and compares the options for, each layer of the model. Because traditional blockchain systems offer limited support for diverse queries, a solution that adds an external database as the query layer is proposed, and its advantages and disadvantages are analyzed from multiple angles. Finally, the traffic data analysis platform is implemented according to this design. The groundwork for the overall platform architecture is completed by configuring the relevant components and setting up the supporting environment. The platform is then tested with multiple sets of query statements covering the Bureau's business scenarios, verifying that the big data platform designed in this paper can solve the analysis difficulties, query inefficiency, and data sharing difficulties caused by the large volume of traffic data. This advances the informatization of the Traffic Police Bureau and helps achieve the goal of "smart transportation" better and faster. |
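
The association rule analysis of query logs mentioned in the abstract can be illustrated with a minimal sketch. It assumes each log entry has already been reduced to the set of dimensions a query aggregates on. The function name `mine_dimension_rules`, the dimension names, and the thresholds are hypothetical and are not taken from the paper. Dimension pairs that satisfy both thresholds are candidates for being kept together when pruning the cube spanning tree. How the paper maps such rules onto Kylin's cube configuration is not stated here.

```python
from collections import Counter
from itertools import combinations


def mine_dimension_rules(query_log, min_support=0.3, min_confidence=0.8):
    """Mine pairwise rules 'queries that use x also use y' from a query log.

    query_log is a list of sets of dimension names (hypothetical input format).
    Returns a sorted list of (x, y, support, confidence) tuples.
    """
    total = len(query_log)
    single_counts = Counter()
    pair_counts = Counter()
    for dims in query_log:
        dims = set(dims)
        single_counts.update(dims)
        # Count each unordered pair of dimensions once per query.
        pair_counts.update(frozenset(p) for p in combinations(sorted(dims), 2))

    rules = []
    for pair, count in pair_counts.items():
        support = count / total
        if support < min_support:
            continue
        a, b = sorted(pair)
        # Evaluate the rule in both directions: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = count / single_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


# Hypothetical log: each entry is the set of dimensions one query grouped by.
query_log = [
    {"district", "violation_type"},
    {"district", "violation_type", "month"},
    {"district", "violation_type"},
    {"month"},
    {"vehicle_type", "month"},
]

for x, y, support, confidence in mine_dimension_rules(query_log):
    print(f"{x} -> {y}: support={support:.2f}, confidence={confidence:.2f}")
```

With the default thresholds, only the `district` and `violation_type` pair survives in this toy log, in both directions. Lowering `min_support` admits rarer combinations at the cost of a larger cube.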