| The expansion of cities leads to an increase in car ownership, which makes traffic congestion increasingly serious. Developing public transportation has become the primary means of addressing urban transportation problems. Bus passenger origin-destination (OD) mining based on bus operational data means matching and joining the boarding and alighting records of each passenger in the bus operational data and then analysing station passenger flow on that basis, in order to provide a decision basis for bus line optimization and bus scheduling. Passenger flow analysis and prediction based on the mined OD data is a crucial step in implementing an urban intelligent transportation system, and it is a research hot spot in the public transportation field. Because bus operational data have a large volume and a low value density, the bus OD mining methodology was developed with mining efficiency and success rate as key considerations. Multi-angle analysis and prediction of passenger flow were then carried out based on the bus OD data. The main work and innovations are as follows: (1) To overcome the difficulty of application and to improve efficiency, a bus big-data platform was built on a bare-metal virtualization platform and big-data components. The basic bus OD mining method was developed on this platform. Hive-based spatial-temporal relationship measurement and resource-oriented Hive tuning improved the efficiency of bus OD mining. The methodology mined tens of millions of bus OD records from tens of millions of card-tapping records with acceptable offline time costs under production conditions, and the result passes the trip production and attraction verification. This indicates that the methodology satisfies the requirements of offline mining in a production environment. (2) To address the low success rate, boarding stations were first matched based on a time threshold. Second, a candidate station set was designed to implement the 
method of boarding station matching based on the number of boarding passengers and the method of alighting station matching based on probability. A Hive UDF was then used to implement the candidate station set, and Hive tuning methods were used to reduce resource waste. As a result, a 100% O (origin) matching rate and a 99.6% D (destination) prediction rate were reached. Finally, visual analysis of the OD matrix, the cross-section passenger flow, and the cumulative passenger flow was implemented, and suggestions were made for the bus management departments accordingly. (3) An OD prediction model based on the Non-symmetric Contextualized Spatial-Temporal Network (NSTN) was proposed, built on CNN and ConvLSTM. It consists of two parts: Station Spatial Context and Spatial-temporal Context. The former embeds the asymmetric relationship between the OD matrix and the DO matrix, which biases the prediction toward the OD matrix part and helps ensure accuracy and stability. The latter implements the basic prediction function based on ConvLSTM. Compared with five other models, NSTN achieved the highest accuracy as measured by SMAPE and RMSE. NSTN retained this high accuracy when the dataset was changed to another bus line. The predicted OD matrix passes the trip production and attraction verification, which indicates strong generalization ability and suitability for real-world OD matrix prediction. |
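
The time-threshold boarding match mentioned in item (2) can be illustrated with a minimal Python sketch. The abstract does not give the exact rule, so this assumes that a card tap is assigned to the station whose recorded bus arrival time is closest to the tap time, and that the match is rejected when the gap exceeds the threshold. The function name `match_boarding_station`, the 3-minute default, and the sample data are hypothetical and are not taken from the original work, which implements its matching in Hive.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def match_boarding_station(tap_time, arrivals, threshold=timedelta(minutes=3)):
    """Return the station whose arrival time is nearest to tap_time.

    arrivals: list of (arrival_time, station_id) for ONE bus trip,
              sorted by arrival_time (an assumed input layout).
    Returns None when no arrival lies within the threshold.
    """
    times = [t for t, _ in arrivals]
    # Index of the first arrival at or after the tap.
    i = bisect_left(times, tap_time)
    # Only the arrival just before and the one at/after the tap can be nearest.
    candidates = arrivals[max(i - 1, 0): i + 1]
    best = min(candidates, key=lambda a: abs(a[0] - tap_time), default=None)
    if best is None or abs(best[0] - tap_time) > threshold:
        return None  # unmatched: left for a fallback such as a candidate station set
    return best[1]


# Hypothetical arrival log for one bus trip.
arrivals = [
    (datetime(2020, 1, 1, 8, 0), "A"),
    (datetime(2020, 1, 1, 8, 10), "B"),
    (datetime(2020, 1, 1, 8, 20), "C"),
]

matched = match_boarding_station(datetime(2020, 1, 1, 8, 9), arrivals)
unmatched = match_boarding_station(datetime(2020, 1, 1, 8, 15), arrivals)
```

Here a tap at 08:09 falls one minute before the arrival at station B and is matched to it, while a tap at 08:15 is five minutes from both neighbouring arrivals and stays unmatched. Records of the second kind are the ones a fallback step, such as the candidate station set described above, would need to handle.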