| On February 25, 2021, at a press conference on air pollution prevention and control, the Ministry of Ecology and Environment of China pointed out that an important task during the "14th Five-Year Plan" period is to basically eliminate severely polluted weather. Accurate prediction of air pollution can provide early warning information and allow pollution prevention measures to be taken in advance, which is of great significance. However, most existing studies focus only on improving the accuracy of numerical (point) prediction of pollutant concentrations. The information provided by numerical prediction results is limited, and it cannot convey uncertainty. In contrast, interval prediction contains both certainty and uncertainty information. Therefore, taking the influence of meteorological factors into account, this paper proposes methods for both numerical prediction and interval prediction of atmospheric pollutant concentration. Two severe pollution periods in the early stage of the novel coronavirus pneumonia epidemic in Beijing are used as examples to verify the feasibility of the proposed models. The main contributions of this paper are as follows. (1) Feature engineering based on random forest (RF) and fuzzy information granulation (FIG). Existing analyses of influencing factors mostly perform feature selection only. To further improve the accuracy of air pollutant forecasting, this paper conducts feature selection, feature weight calculation and new feature construction. Firstly, the permutation importance of every candidate influencing factor is calculated based on RF, and the factors whose permutation importance equals 0 are removed. The calculated permutation importance values are used as the weights of the corresponding influencing factors. Then, the FIG parameters of these influencing factors are calculated and used to construct new features with clearer physical meaning. Finally, the influencing factors of atmospheric pollutant concentration in
the early period of the novel coronavirus pneumonia epidemic in Beijing are analyzed to provide a reference for Beijing’s air pollution prevention and to lay a solid foundation for the subsequent predictions. (2) An adaptive decomposition and ensemble numerical prediction model is proposed. On the one hand, most existing decomposition and ensemble studies split the training set and test set on the decomposition results rather than on the original time series, which makes the forecasting process unrealistic, since the decomposition then uses data from the test period. On the other hand, the significant boundary effect in the decomposition results damages prediction accuracy to a large extent. To solve these problems, this paper analyzes the cause of the boundary effect, introduces the partial autocorrelation function (PACF) to calculate the time lag of the boundary effect, combines a continuous decomposition process with a particle swarm optimized BP neural network (PSO-BPNN), and proposes an adaptive decomposition and ensemble numerical prediction model. Finally, using air pollutant and meteorological data from Beijing, a case study is carried out on the early period of the novel coronavirus pneumonia epidemic. Two types of comparative experiments are designed to evaluate the model in terms of prediction accuracy, stability and robustness, and the results demonstrate the superiority of the proposed model. (3) An interval prediction model based on fuzzy theory is proposed. An analysis of existing interval prediction methods for air pollutant concentration shows that most of them rely either on a distribution assumption or on the maximum and minimum values of the original time series. These two approaches have the inherent drawbacks of requiring an assumed distribution and of losing information, respectively. To avoid these problems, this paper introduces FIG. Based on FIG, the original time series is divided into realistic time windows and converted into fuzzy information granules. Then, the fuzzy information granulation parameters are
calculated based on the triangular membership function. Finally, the PSO-BPNN is applied to obtain the interval prediction results. The entire experimental process is verified on the Beijing air pollutant concentration dataset from the early period of the novel coronavirus pneumonia epidemic. Two numerical indicators, the true value coverage percentage and the interval width, are used to evaluate the proposed interval prediction model and demonstrate its validity. In summary, this paper provides both certainty and uncertainty prediction information on atmospheric pollutant concentrations, improves the accuracy and reference value of the prediction results, and offers scientific early warning information and decision-making support for the public’s daily life and for air pollution prevention. |
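
The granulation and evaluation steps described in contribution (3) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the formulas, so the sketch assumes one commonly used triangular FIG formulation (the window median as the core, and lower and upper bounds derived from the means of the values below and above it). The function names `triangular_granule`, `granulate`, `coverage_percentage` and `mean_interval_width` are hypothetical, and the PSO-BPNN prediction step is omitted.

```python
from statistics import mean, median


def triangular_granule(window):
    """Return (low, mid, up) of a triangular fuzzy granule for one window.

    Assumed formulation (not taken from the paper): mid is the window
    median; low and up mirror the mean of the lower and upper halves
    around that median.
    """
    values = sorted(window)
    mid = median(values)
    lower = [v for v in values if v <= mid]
    upper = [v for v in values if v >= mid]
    low = 2 * mean(lower) - mid
    up = 2 * mean(upper) - mid
    return low, mid, up


def granulate(series, window_size):
    """Granulate consecutive, non-overlapping windows of a time series.

    A trailing incomplete window is dropped.
    """
    return [
        triangular_granule(series[i:i + window_size])
        for i in range(0, len(series) - window_size + 1, window_size)
    ]


def coverage_percentage(actual, lows, ups):
    """Percentage of true values that fall inside their [low, up] interval."""
    hits = sum(1 for y, lo, up in zip(actual, lows, ups) if lo <= y <= up)
    return 100.0 * hits / len(actual)


def mean_interval_width(lows, ups):
    """Average width of the intervals; narrower is more informative."""
    return mean(up - lo for lo, up in zip(lows, ups))


if __name__ == "__main__":
    # Made-up hourly concentrations, granulated in windows of three values.
    series = [10, 20, 30, 40, 50, 60]
    for low, mid, up in granulate(series, 3):
        print(low, mid, up)
```

In a full pipeline under these assumptions, the `low`, `mid` and `up` sequences would each be forecast by a model such as the PSO-BPNN, and the forecast `low` and `up` values would then be scored with `coverage_percentage` and `mean_interval_width`. The two indicators pull in opposite directions: widening every interval raises coverage but makes the forecast less useful.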