| As urbanization in China expands and the income of urban residents rises, the number of private cars keeps growing, which brings a series of traffic problems. Intelligent transportation systems emerged to make urban traffic easier to manage. By combining modern technology with the specific needs of a city, such a system collects and processes traffic information in real time, monitors the current traffic environment, and regulates it accordingly. It is therefore of great significance for the efficient operation and sustainable development of urban traffic. Data storage and retrieval is one of the core functions of an intelligent transportation system. In actual use at the Public Security Bureau, monitoring equipment produces massive amounts of data every day: in Zhejiang Province, hundreds of millions of vehicle records are generated in a single day, and the data arrive at random. A traditional relational database cannot support access to data at this scale because of its strict table-structure constraints. Moreover, when a single table stores tens of billions of records, the index itself becomes too large, so the database cannot meet retrieval requirements and the system can easily be paralyzed. Based on an in-depth study of these issues, this paper designs a storage and retrieval system for intelligent-traffic big data at the scale of 100 billion records. The system adopts a distributed cluster scheme built on the distributed framework Hadoop. The cluster uses ZooKeeper for consistency management and YARN for resource allocation. To guarantee cluster stability, load balancing and high availability are implemented through a virtual IP and ZooKeeper; these mechanisms handle highly concurrent connections and single points of failure while keeping the externally visible address consistent. To address the difficulty of storing and retrieving massive data, the search engine Solr and the non-relational database HBase are introduced as the storage and retrieval scheme. Because highly concurrent incoming data make Solr unstable, the paper designs a data buffering and consumption strategy based on Kafka and Spark Streaming. To reduce the high latency of retrieval over 100 billion records, the paper designs a Solr sub-core algorithm and a time compression algorithm, which together allow a search over 100 billion records to complete within one second. A page-turning cache is also designed to improve the user experience. Finally, the system is tested. The results show that the system is stable and can store many types of data, and that response times under a variety of query conditions remain below one second when the database holds more than 100 billion vehicle records. |
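
The abstract does not spell out how the Solr sub-core algorithm works. One plausible reading, offered here only as an assumption, is that the index is split into several cores by time, so that a query with a time range searches only the cores overlapping that range. The minimal sketch below illustrates that idea; the per-day granularity, the core naming scheme `traffic_YYYYMMDD`, and the function names are all hypothetical and are not taken from the paper.

```python
from datetime import date, timedelta

# Hypothetical naming scheme for per-day cores (an assumption, not from the paper).
CORE_PREFIX = "traffic_"


def core_for_day(day: date) -> str:
    """Return the assumed name of the core holding records for one day."""
    return f"{CORE_PREFIX}{day:%Y%m%d}"


def cores_for_range(start: date, end: date) -> list[str]:
    """List the per-day cores a query over the inclusive range [start, end] must search."""
    if end < start:
        raise ValueError("end must not precede start")
    days = (end - start).days + 1
    return [core_for_day(start + timedelta(days=i)) for i in range(days)]
```

Under these assumptions, `cores_for_range(date(2017, 3, 30), date(2017, 4, 1))` selects three cores, so a query restricted to those three days would not need to touch the rest of the index.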