With urban development and population agglomeration, the number of cars in cities continues to increase, and air pollutants emitted by surrounding factories have degraded the urban environment, seriously affecting residents' travel and health. It is therefore valuable to use urban big data, such as air quality data, meteorological data, and spatial points of interest (POI), to build an accurate air quality model that helps residents plan their travel and assists the government in making environmental protection decisions. Approaching the construction of an air quality prediction model from both the time dimension and the space dimension not only broadens the research perspective but also applies the concept of data fusion, integrating time series with spatial information. Taking PM2.5 concentration as an example, this paper explores methods for extracting features of the PM2.5 sequence in the time and space dimensions, incorporates them into the prediction model, and dynamically combines the prediction results of the two dimensions to improve prediction performance. The main work is as follows.

First, the theoretical algorithms underlying spatiotemporal models are explored, including the handling of missing data and outliers and feature selection based on correlation theory. On this basis, the paper studies in depth several cutting-edge algorithms for PM2.5 prediction, including modal decomposition, time series clustering, and deep neural networks, and constructs the theoretical framework of the PM2.5 spatiotemporal prediction model.

Second, a spatiotemporal prediction model is built: a temporal predictor and a spatial predictor are constructed on the basis of an analysis of the characteristics of PM2.5 prediction in the time and space dimensions. In the time dimension, modal decomposition is used to extract the fluctuation characteristics of the PM2.5 data, a time series clustering algorithm is used to reconstruct the components, and the temporal predictor is constructed based on the ELSTM model. In the spatial dimension, the Laplacian operator is used to extract the spatial relationships among sites from the perspective of a graph model, in order to construct the spatial predictor. Finally, XGBoost is used to dynamically aggregate the two sets of results, completing the LX-M-CEEMDAN-VMD-LSTM model.

Third, the PM2.5 concentration sequence is predicted using air pollutant concentration data, meteorological data, and geographic information for Lanzhou. In the temporal prediction module, CEEMDAN and VMD are combined into a two-level decomposition method to extract time series information, followed by cluster-based reconstruction; this not only improves the accuracy of time series prediction but also simplifies the model. In the spatial prediction module, the Laplacian matrix effectively extracts the spatial features of the data and improves the accuracy of spatial prediction. XGBoost is used to extract the importance of the various features and to dynamically combine the temporal and spatial features so that each dimension compensates for the shortcomings of the other. Root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) are used to compare the models and assess their effectiveness. The empirical results show that the proposed model significantly improves prediction accuracy and outperforms the control model on all metrics.
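
The three evaluation metrics named above can be sketched as follows. This is a minimal illustration of the standard textbook definitions, not the paper's own evaluation code; the function names and the sample concentration values are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared residual."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute residuals."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero; otherwise the relative
    error is undefined and a ZeroDivisionError is raised.
    """
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Hypothetical PM2.5 concentrations (illustrative values, not from the paper).
observed = [50.0, 80.0, 100.0]
forecast = [45.0, 88.0, 100.0]

print(f"RMSE = {rmse(observed, forecast):.3f}")
print(f"MAE  = {mae(observed, forecast):.3f}")
print(f"MAPE = {mape(observed, forecast):.3f}%")
```

Lower values indicate better predictions for all three metrics. RMSE penalizes large errors more heavily than MAE, while MAPE expresses the error relative to the observed concentration, which is why the three are usually reported together.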