| Traffic flow forecasting is one of the supporting methods of traffic management and control systems. Accurate and reliable forecasts can serve as input parameters for traffic control and provide theoretical support for control decisions. The accuracy of the prediction results depends on the accuracy of the model construction. In view of this, this paper studies the distribution characteristics and modeling of short-term traffic flow forecasts as follows. Firstly, because the distribution analysis and modeling are performed on short-term traffic flow forecasting residuals, the paper examines traffic data preprocessing methods, such as the handling of outliers and missing data. A SARIMA prediction model is established for the preprocessed traffic flow data, and the traffic flow forecasts and forecast residuals are obtained. The accuracy of SARIMA predictions under different time aggregation levels is analyzed, and the conclusion is that prediction accuracy increases as the time aggregation level increases. Secondly, normality tests are set up for the traffic flow forecast residuals, and different test methods are designed. According to the theory of normality testing, the way the data are grouped affects the test result. Therefore, the forecast residuals obtained under different time aggregation levels are grouped by year, month, week and day, and the hypothesis that the residuals are normally distributed is tested for each grouping with the different test methods. The test results show that the assumption of normally distributed prediction residuals in short-term traffic flow uncertainty prediction does not hold. The proportion of residual series that pass the normality test differs across groupings and aggregation levels. As the aggregation level increases, normality is rejected more often. As the grouping becomes finer, from year to month to week to day, normality is rejected less often. In addition, alternative distributions are fitted to the traffic flow prediction residuals that do not conform to the normality hypothesis. The t distribution and the generalized error distribution (GED) are typical heavy-tailed distributions whose tail thickness can be adjusted through a shape parameter. The paper first introduces several typical distributions used to describe sharp peaks and heavy tails, together with their practical meaning, and then fits these distributions to the traffic flow prediction residuals. The fitting results show that the normal distribution underestimates both the peak and the tails of the short-term traffic flow prediction residuals. The logistic and Cauchy distributions improve the fit only at the peak: in the tails, the logistic distribution fits about as well as the normal distribution, while the Cauchy distribution overestimates the tails. The t distribution and the GED fit both the peak and the tails of the prediction residuals well. Finally, uncertainty prediction is modeled under the normal, t and GED assumptions, and the parameter estimation theory under each distributional assumption is analyzed. The results show that the t distribution assumption yields better prediction intervals for the traffic flow prediction residual data studied. |
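
The grouped normality testing described above can be illustrated with a minimal sketch. The abstract does not name the test methods used, so the Jarque-Bera test is an assumption made here for illustration; the synthetic residuals, the group sizes, and the function names `jarque_bera`, `student_t_sample` and `rejection_rate` are likewise hypothetical and do not come from the paper.

```python
import math
import random


def jarque_bera(residuals):
    """Return (JB statistic, asymptotic p-value) for a sample of residuals.

    Assumes the sample is not constant (non-zero variance).
    """
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    m2 = sum(c ** 2 for c in centered) / n
    m3 = sum(c ** 3 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    # Under normality JB is asymptotically chi-square with 2 degrees of
    # freedom, whose survival function is exactly exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


def student_t_sample(rng, df, size):
    """Draw heavy-tailed values as standard normal / sqrt(chi-square / df)."""
    sample = []
    for _ in range(size):
        z = rng.gauss(0.0, 1.0)
        chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees
        sample.append(z / math.sqrt(chi2 / df))
    return sample


def rejection_rate(groups, alpha=0.05):
    """Share of groups whose residuals reject normality at level alpha."""
    rejected = sum(1 for g in groups if jarque_bera(g)[1] < alpha)
    return rejected / len(groups)


# Synthetic stand-ins for forecast residuals (not the paper's data).
rng = random.Random(0)
normal_resid = [rng.gauss(0.0, 1.0) for _ in range(2000)]
heavy_resid = student_t_sample(rng, df=3, size=2000)

jb_normal, p_normal = jarque_bera(normal_resid)
jb_heavy, p_heavy = jarque_bera(heavy_resid)
print(f"normal residuals:       JB={jb_normal:.2f}, p={p_normal:.4f}")
print(f"heavy-tailed residuals: JB={jb_heavy:.2f}, p={p_heavy:.4g}")

# Coarse versus fine grouping of the same heavy-tailed residual series.
coarse_groups = [heavy_resid[i:i + 500] for i in range(0, 2000, 500)]
fine_groups = [heavy_resid[i:i + 50] for i in range(0, 2000, 50)]
print("rejection rate, coarse groups:", rejection_rate(coarse_groups))
print("rejection rate, fine groups:  ", rejection_rate(fine_groups))
```

In practice the residuals would come from the fitted SARIMA model, and the groups would be formed from calendar periods (year, month, week, day) rather than fixed-length slices.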