
Research And Implementation Of Forecasting Technology For Bus Lines Passenger Flow Based On Spark Platform

Posted on: 2018-10-28
Degree: Master
Type: Thesis
Country: China
Candidate: M Q Zhao
GTID: 2322330518993399
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of public transport and the widespread use of public transportation smart cards, smart card charging systems produce massive volumes of card transaction data every day. Distributed processing of this card data to obtain useful bus passenger flow information has become a hot research topic. At present, prediction algorithms for bus passenger flow are not accurate enough in real scenarios, their models fit poorly, and they usually lack efficiency. Meanwhile, the problem that execution efficiency falls as data volume grows remains unsolved, and high computational complexity makes these algorithms difficult to deploy on a distributed cluster. These shortcomings of traditional methods show that close collaboration between the bus passenger flow forecasting method and the cloud platform computing framework is especially important. Firstly, this paper presents a nonparametric stochastic modeling method (simHash) that aims to process card transaction data efficiently. The method combines the historical card transaction data of bus lines with weather data in order to design a feature set for bus passenger flow covering aspects such as time, card type and weather. The simHash method handles similar data instances by using a feature mapping function with a wider range, which makes the prediction model more accurate. Secondly, this paper presents a passenger flow forecasting method based on the simHash modeling method. We use simHash to convert the bus passenger feature data into hash codes, and then randomly divide them into the related partitions to establish the model. Within each partition, the model can be trained and used for forecasting independently.
The presented method uses suitable training and forecasting procedures to process the data in a distributed manner, which significantly improves execution efficiency and effectively reduces the large computational cost incurred by the traditional tree structure. Finally, in order to verify the proposed bus passenger flow forecasting method, this paper builds an application for bus passenger flow analysis and prediction on the Spark platform. Experiments on actual card transaction data, bus line data and weather data show that, compared with the traditional forecasting method and with the presented method running on a stand-alone system, our forecasting method not only improves prediction accuracy but also significantly improves execution efficiency on massive data. It therefore resolves the conflict between data volume and execution efficiency.
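The hash-then-partition idea described above can be illustrated with a minimal single-machine sketch. It is not the thesis's implementation: the feature names (`hour`, `card_type`, `weather`), the use of fingerprint prefix bits to pick a partition, and the per-partition mean used as the "model" are all assumptions made for illustration, and the abstract does not specify any of them. Only the generic SimHash fingerprint (per-bit voting over token hashes) is a standard technique.

```python
import hashlib
from collections import defaultdict
from statistics import mean

HASH_BITS = 64


def simhash(tokens):
    """Generic SimHash: each token hash votes +1/-1 per bit; keep positive bits."""
    votes = [0] * HASH_BITS
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # 64-bit token hash
        for bit in range(HASH_BITS):
            votes[bit] += 1 if (h >> bit) & 1 else -1
    return sum(1 << bit for bit in range(HASH_BITS) if votes[bit] > 0)


def partition_of(fingerprint, prefix_bits=2):
    """Assumed partitioning rule: the top prefix_bits of the fingerprint."""
    return fingerprint >> (HASH_BITS - prefix_bits)


def features(record):
    """Hypothetical feature set (time, card type, weather) as string tokens."""
    return [
        f"hour={record['hour']}",
        f"card={record['card_type']}",
        f"weather={record['weather']}",
    ]


def train(records, prefix_bits=2):
    """Fit one placeholder model (the mean flow) per partition, independently."""
    buckets = defaultdict(list)
    for record in records:
        key = partition_of(simhash(features(record)), prefix_bits)
        buckets[key].append(record["flow"])
    return {key: mean(flows) for key, flows in buckets.items()}


def predict(model, record, prefix_bits=2, default=0.0):
    """Route the record to its partition and return that partition's estimate."""
    key = partition_of(simhash(features(record)), prefix_bits)
    return model.get(key, default)


history = [
    {"hour": 8, "card_type": "adult", "weather": "rain", "flow": 100},
    {"hour": 8, "card_type": "adult", "weather": "rain", "flow": 120},
]
model = train(history)
query = {"hour": 8, "card_type": "adult", "weather": "rain"}
estimate = predict(model, query)
```

Because each partition's model depends only on the records routed to it, the per-partition training step is the part that could be distributed across a cluster; on Spark this would correspond to grouping records by partition key, though the sketch above does not use Spark itself.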
Keywords/Search Tags: bus passenger flow forecasting, card transaction data, nonparametric stochastic modeling, Spark