| With the rapid development of the urban economy, urban traffic congestion is becoming increasingly serious. As a form of public transport with large passenger capacity and a small road-space footprint, buses can significantly relieve the pressure on urban public resources. Establishing an efficient intelligent public transportation system is therefore very important for the development of urban transportation, and accurate prediction of bus passenger flow is one of the key prerequisites for scientific vehicle management in such a system. However, most cities still rely on manual judgment and decision-making for bus scheduling, which is inefficient. The massive data generated by the construction and development of smart bus systems in major cities, such as bus operation GPS data, bus line data, and passenger IC card data, have not been fully mined, and these data resources have largely been wasted. It is therefore imperative to establish an efficient prediction model that fully mines bus-related data, solves the problem of accurate short-term bus passenger flow prediction, and provides real-time, comprehensive basic data for urban bus planning and operation management. This thesis first describes the structure of the raw public transport data and the methods used to preprocess it, and analyzes the key fields and core data used for short-term bus passenger flow forecasting. To address the performance loss and low accuracy of existing passenger flow prediction models, a multi-feature Gradient Boosting Decision Tree (GBDT) model is proposed. The GBDT algorithm is widely used in prediction tasks such as financial credit evaluation, commercial product allocation, and traffic flow prediction, but it has seen few applications in bus passenger flow prediction. Exploiting the algorithm's flexibility in handling complex traffic data, we build a GBDT prediction model. By analyzing the factors that affect the distribution of passenger flow, we mine passenger-flow-related features from the corresponding data, construct week, time-period, and environment features, and fuse them into the model to increase prediction accuracy. Experiments on real traffic data from Guangzhou show that, compared with three other commonly used prediction models, the multi-feature GBDT model predicts bus passenger flow more accurately and effectively. A comparison experiment on the model after multi-feature fusion also shows that the features constructed through data mining analysis in this thesis effectively improve model performance. Finally, the passenger flow information is visualized. |
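
To make the multi-feature idea summarized above more concrete, the following is a minimal Python sketch and not the implementation used in this thesis. It derives a few week, time-period, and environment features from a timestamp and fits a small gradient-boosted ensemble of depth-one regression trees under squared-error loss. The function names, the particular features (weekday index, weekend flag, hour, rain flag), the use of depth-one trees, and the passenger counts are all assumptions made purely for illustration; the counts are synthetic and are not the Guangzhou data.

```python
from datetime import datetime


def build_features(ts, is_rainy):
    """Map one observation time to week, time-period, and environment features."""
    return [
        float(ts.weekday()),       # day of week, 0 = Monday (assumed week feature)
        float(ts.weekday() >= 5),  # weekend flag (assumed week feature)
        float(ts.hour),            # hour of day (assumed time-period feature)
        float(is_rainy),           # rain flag (hypothetical environment feature)
    ]


def fit_stump(X, residuals):
    """Return (feature, threshold, left_mean, right_mean) minimising squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue  # a split must put samples on both sides
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return best[1:] if best else None


def fit_gbdt(X, y, n_rounds=50, learning_rate=0.1):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        j, t, lm, rm = stump
        stumps.append(stump)
        preds = [p + learning_rate * (lm if row[j] <= t else rm)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps


def predict(model, row):
    """Sum the base value and the shrunken contribution of every stump."""
    base, lr, stumps = model
    return base + lr * sum(lm if row[j] <= t else rm for j, t, lm, rm in stumps)


def mse(preds, targets):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


# Synthetic observations: (time, rain flag, boarding count). Not real data.
observations = [
    (datetime(2024, 1, 1, 8), False, 120),
    (datetime(2024, 1, 1, 13), False, 60),
    (datetime(2024, 1, 1, 18), True, 110),
    (datetime(2024, 1, 6, 8), False, 50),
    (datetime(2024, 1, 6, 13), True, 40),
    (datetime(2024, 1, 6, 18), False, 55),
]
X = [build_features(ts, rain) for ts, rain, _ in observations]
y = [float(count) for _, _, count in observations]
model = fit_gbdt(X, y)
```

Calling `predict(model, build_features(datetime(2024, 1, 8, 8), False))` would then give a forecast for a hypothetical Monday morning slot. In practice a library implementation with deeper trees and held-out evaluation would be the more likely choice; the sketch only shows how constructed features enter a boosted model.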