With the improvement of urban rail transit infrastructure, more and more travelers choose urban rail transit as their mode of travel. Accurate prediction of entrance and exit passenger flow helps the operating department draw up a more reasonable operation plan and improve passenger comfort. However, the shorter the time interval, the stronger the randomness of the passenger flow data within each unit of time, which not only poses new challenges for short-term passenger flow prediction but also makes subway operation management more difficult. As the types of available data increase, research on short-term passenger flow prediction for urban rail transit stations has begun to move toward combined prediction and ensemble learning. Therefore, building on a prediction framework based on ensemble learning, this thesis proposes model fusion and parameter optimization methods to predict the entrance and exit passenger flow of urban rail transit stations more accurately. First, the passenger flow data are analyzed to extract conventional features such as the time point and the passenger flow at the same time in the previous week. Because existing studies consider influencing factors along only a single dimension, external factors such as weather, the traffic congestion index, and air quality are added as influencing factors of station passenger flow. Second, the passenger flow data are preprocessed through data collection and data cleaning, and features are filtered by calculating their correlation. To improve the efficiency and accuracy of the model, an improved station coding scheme based on word2vec is proposed, and a Yeo-Johnson power transform is used to normalize the passenger flow data. Then, a short-term passenger flow prediction framework based on Boosting is constructed. Two ensemble learning algorithms, XGBoost and LightGBM, are introduced, and a model fusion scheme is proposed. To address the low efficiency of cross-validation grid search as a parameter tuning scheme, a data set splitting scheme based on time series is proposed and combined with a Bayesian optimization algorithm to determine the optimal combination of parameters. Finally, a case study of the urban rail transit system of a city in China is conducted, and the validity of the Boosting-based short-term passenger flow prediction model is verified through comparisons among various models. Sensitivity analysis of the external features and station coding schemes shows that the external features reduce the mean absolute error of prediction by 2.13%, and that the improved word2vec-based station coding method further reduces the prediction error while also reducing the running time of the model. In addition, the fusion model based on Bayesian optimization proposed in this thesis reduces the prediction error by 8.03% compared with the single model, and its prediction efficiency is significantly higher than that of the fusion model based on cross-validation grid search.
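The time-series-based data set split and the fusion of two model outputs mentioned above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names (`time_series_splits`, `fuse`, `best_fusion_weight`), the expanding-window layout, the weighted-average form of the fusion, and the plain scan over candidate weights are all assumptions made for illustration. In particular, the scan stands in for the Bayesian optimization used in the thesis, and the two prediction lists stand in for the outputs of XGBoost and LightGBM.

```python
from statistics import mean


def time_series_splits(n_samples, n_folds):
    """Yield (train_idx, valid_idx) pairs in chronological order.

    Assumed layout: an expanding training window, so every validation
    block lies strictly after the data used for training.
    """
    fold_size = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_end = k * fold_size
        # The last fold absorbs any leftover samples.
        valid_end = n_samples if k == n_folds else train_end + fold_size
        yield list(range(train_end)), list(range(train_end, valid_end))


def mae(y_true, y_pred):
    """Mean absolute error between two equally long sequences."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def fuse(pred_a, pred_b, weight):
    """Weighted average of two models' predictions (hypothetical fusion rule)."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(pred_a, pred_b)]


def best_fusion_weight(y_true, pred_a, pred_b, steps=20):
    """Pick the weight in [0, 1] with the lowest validation MAE.

    A simple scan over evenly spaced candidates, used here only as a
    stand-in for a Bayesian optimizer.
    """
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda w: mae(y_true, fuse(pred_a, pred_b, w)))
```

In use, each `(train_idx, valid_idx)` pair would index a chronologically sorted passenger flow table, the two base models would be fitted on the training part, and `best_fusion_weight` would be evaluated on the validation part. Keeping validation data strictly later than training data avoids leaking future passenger flow into the tuning step, which a shuffled cross-validation split would not guarantee.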