Distributed Storage And Analysis Of Massive Urban Traffic Flow Data Based On Hadoop

Posted on: 2016-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: L J Zhu
Full Text: PDF
GTID: 2322330488993973
Subject: Computer technology
Abstract/Summary:
With the rapid development of intelligent transportation infrastructure and the gradual rise in urban residents' incomes, the number of cars in cities keeps increasing. Inductive loop detectors and checkpoint (bayonet) monitoring systems deployed across the city can collect, record, summarize, and upload monitoring data in a timely manner. However, because urban traffic flow data is large in volume and has strict real-time requirements, traditional data storage and processing technologies run into problems with data structure and storage capacity, cannot be scaled out, make distributed parallel data mining difficult, and offer poor fault tolerance. How to upload and store massive traffic flow data in real time, and how to carry out statistical analysis and data mining on it, is a major problem. Big data technology, represented by Hadoop, is one effective way to solve it. To address the data storage and analysis problems brought about by current urban traffic development, this thesis studies Hadoop-based big data technologies such as MapReduce and HBase and proposes corresponding solutions. The main research work and results are as follows:

(1) A general framework is presented for the storage and analysis of massive urban road traffic flow data based on Hadoop. The architecture is divided into five levels: data acquisition, hardware platform, data storage and computing, mining and analysis, and application service. In addition, a high-availability scheme for the Hadoop cluster is studied and designed so that the cluster tolerates node failure or downtime.

(2) A distributed storage scheme based on HBase is presented for massive traffic flow data. According to the characteristics of traffic flow data and the application requirements, a table structure is designed to solve the row hotspot problem for traffic flow data. The secondary-index mechanism of HBase is also studied, and a fast data retrieval method for column queries is designed.

(3) A flow and density calculation model is designed and, based on the relationship between traffic flow and density, a parallel MapReduce implementation of traffic density calculation is proposed. The K-nearest neighbor (KNN) nonparametric regression algorithm is used to predict short-term traffic flow. After studying the KNN state vector, the distance measure, the choice of the number of nearest neighbors, and the prediction algorithm, the thesis presents a parallel MapReduce-based KNN implementation for short-term traffic flow prediction, which speeds up the K-nearest-neighbor search and enables scheduled short-term traffic flow forecasts.

(4) Finally, according to the requirements of the application layer in the overall architecture, an urban road traffic flow data analysis system is built and implemented on the Hadoop platform. The system is designed in detail and provides functions such as real-time monitoring and data analysis.
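The MapReduce-style KNN prediction described in item (3) might look roughly like the following. This is only a minimal single-process sketch in plain Python, not the thesis's implementation: the function names, the use of Euclidean distance, the inverse-distance weighting, and the layout of each history record as a (state vector, next-interval flow) pair are all assumptions made for illustration. Each "map" call finds the k closest historical states in one data partition, and the "reduce" step merges those candidates into the global k nearest neighbors and averages their next-interval flows.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length state vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def map_partition(partition, query, k):
    """Map step: return the k closest (distance, next_flow) pairs in one partition.

    Each record is a hypothetical (state_vector, next_flow) pair, where
    state_vector holds recent flow counts and next_flow is the flow that
    followed them historically.
    """
    scored = ((euclidean(state, query), next_flow) for state, next_flow in partition)
    return heapq.nsmallest(k, scored)


def reduce_neighbors(local_results, k):
    """Reduce step: merge per-partition candidates and predict the next flow.

    Keeps the global k nearest neighbors and returns their inverse-distance
    weighted average (the weighting scheme is an assumption).
    """
    merged = heapq.nsmallest(k, (pair for result in local_results for pair in result))
    weights = [1.0 / (dist + 1e-9) for dist, _ in merged]
    return sum(w * flow for w, (_, flow) in zip(weights, merged)) / sum(weights)


def predict(partitions, query, k=3):
    """Run the map step on every partition, then reduce to one prediction."""
    return reduce_neighbors([map_partition(p, query, k) for p in partitions], k)


if __name__ == "__main__":
    # Two toy partitions of made-up history records (vehicles per interval).
    history = [
        [((10, 12, 14), 16), ((50, 52, 54), 56)],
        [((11, 13, 15), 17), ((90, 92, 94), 96)],
    ]
    print(predict(history, (10.5, 12.5, 14.5), k=2))
```

Because each partition only needs to return its own k best candidates, the distance computation can run independently per partition, which is the property a MapReduce job would exploit to speed up the neighbor search.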
Keywords/Search Tags:Intelligent transportation, Hadoop, HBase, MapReduce, Nonparametric regression, KNN