| This paper uses machine learning algorithms to build a haze prediction model that fuses multi-source data. First, we discuss the factors that affect the formation and evolution of haze, analyze the mechanism by which each factor acts on haze, and divide the factors into two types according to those mechanisms: ambient air quality and meteorological conditions. The former are mainly common atmospheric pollutants such as NOx, CO, SO2 and O3; the latter are mainly meteorological parameters such as temperature, pressure, relative humidity and tropospheric delay (ZTD). Second, based on this analysis, we use machine learning algorithms to construct haze prediction models for multiple lead times. These include the BP neural network, which is based on a single learner, and ensemble learning algorithms, which combine multiple learners (GBDT and its enhanced versions, XGBoost and LightGBM). Data from 2018-2019 for the Liangxiang area of Fangshan, Beijing, were used as samples for training and prediction. The main conclusions and innovations are as follows: 1) Based on the chemical composition of haze and the physical changes that occur during its formation, this paper comprehensively considers the factors affecting the formation and change of atmospheric PM2.5 and adds PM2.5 time series data, which reflect the regularity of PM2.5 itself. The result is a haze prediction model that fuses multi-source data. Compared with a single-factor prediction model, the fusion model is more stable and better reflects the actual change in atmospheric PM2.5 concentration. 2) In terms of data selection, ground-based GNSS stations offer wide distribution, continuous observation and high spatiotemporal sampling resolution. Meteorological data from ground-based GNSS observation stations are therefore used instead of data from traditional ground weather stations, and GNSS-derived observations such as tropospheric delay are added. This enables accurate and fast haze prediction at multiple lead times over a wide area. 3) Many factors affect atmospheric PM2.5 concentration, and the concentration changes in a complex, non-linear way. In this paper, we use machine learning algorithms to build the haze prediction models. Comparing the prediction results of the models shows that the BP neural network model has poorer overall accuracy and stability and is suitable only for short-term prediction, while the ensemble learning models perform better overall. Among them, the GBDT model has high accuracy but takes a long time to train; the XGBoost model has the highest accuracy and a shorter training time; the LightGBM model takes the least time to train, but at some cost in accuracy. These conclusions provide a reference for the prediction and prevention of haze. |
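
The fusion of pollutant, meteorological and lagged PM2.5 inputs described above can be illustrated with a minimal sketch of how training samples for one forecast lead time might be assembled. This is not the paper's implementation: the function name `build_samples`, the record field names (`pm25`, `no2`, `ztd`) and the parameters `n_lags` and `horizon` are all hypothetical, and the records are assumed to be equally spaced in time with no missing values.

```python
from typing import Dict, List, Tuple


def build_samples(
    records: List[Dict[str, float]],
    target: str = "pm25",   # hypothetical field name for the PM2.5 series
    n_lags: int = 3,        # how many past target values to include
    horizon: int = 1,       # forecast lead time, in time steps
) -> Tuple[List[List[float]], List[float]]:
    """Build (features, label) pairs for one forecast lead time.

    Each feature row holds the current value of every non-target field
    (pollutants and meteorological parameters, in sorted key order),
    followed by the n_lags most recent target values (newest first).
    The label is the target value `horizon` steps ahead.
    """
    if n_lags < 1 or horizon < 1:
        raise ValueError("n_lags and horizon must be positive")
    if not records:
        return [], []

    # All fields except the target are treated as exogenous inputs.
    exog = sorted(k for k in records[0] if k != target)

    features: List[List[float]] = []
    labels: List[float] = []
    # t must have n_lags - 1 earlier steps and `horizon` later steps.
    for t in range(n_lags - 1, len(records) - horizon):
        row = [records[t][k] for k in exog]
        row += [records[t - i][target] for i in range(n_lags)]
        features.append(row)
        labels.append(records[t + horizon][target])
    return features, labels
```

Calling `build_samples` once per lead time (for example `horizon=1`, `3` and `6`) yields one training set per horizon, and each set could then be passed to any regressor, such as a BP neural network or a gradient-boosted tree model.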