| In recent years, China's rapid economic development has brought serious environmental pollution. Haze, a severe form of air pollution, endangers both air quality and residents' physical health. PM2.5, one of the main components of haze, can seriously damage the human respiratory system and thereby reduce life expectancy. PM2.5 also affects the climate directly and indirectly, threatening the human living environment. It is therefore of both theoretical and practical significance to build an intelligent early-warning system that accurately predicts PM2.5 pollution and protects people from environmental risks. This study is grounded in machine learning and related technologies. Based on the interaction of multiple parameters in PM2.5 concentration prediction, it uses three relatively mature machine learning algorithms, together with the Stacking Integrated Learning Algorithm (SILA) for fusion optimization, to design a high-accuracy hybrid prediction model. Unlike traditional machine learning algorithms, SILA is a collection of multi-layer models. In a two-layer model, for example, the data are divided into a training set and a test set. The training set is used to train and optimize the first-layer learners; the test set is then fed to the optimized learners, and their outputs become the input to the model of the next layer. The secondary learner is then trained with the final label as its output value, which completes the training. The main contents of this study are as follows. After reviewing the development of PM2.5 concentration prediction in China and abroad, the study briefly introduces the composition and formation of PM2.5. It then explains the basic principles of the algorithm models used to predict PM2.5 concentration, along with the theory and technology required for system design. The key to PM2.5 concentration
prediction is the accuracy of the algorithm module. To overcome the complicated procedures and insufficient precision of existing PM2.5 concentration prediction models, this study took Beijing as its research setting. Air quality data from 2010 to 2014 were obtained from the Beijing PM2.5 Data Set. After feature correlation and feature importance analysis, polynomial feature expansion was applied to combine the features and generate new ones. The new features were first screened by XGBoost feature importance and then filtered further with an exhaustive verification method to determine the optimal combination of input features. Building on the interaction of multiple parameters in PM2.5 concentration prediction, the study employed three relatively mature machine learning algorithms, together with SILA for fusion optimization. Cross-validation and grid search were used to tune the parameters. The final results showed that the stacking integrated models performed satisfactorily on average, with R² > 0.9, RMSE > 50 μg/m³, and MAE > 14 μg/m³. In particular, the Stacking Huber integrated model achieved the best training result, with R² = 0.931, RMSE = 50.627 μg/m³, and MAE = 14.537 μg/m³, strongly indicating the validity of the prediction model. Finally, based on the tested stacking integrated model and the Web technologies Python and Django, requirements analysis and system design were carried out to build an intelligent early-warning system for PM2.5 pollution. The system collects and analyzes data automatically and lets users manage historical meteorological data and forecast data. The prediction module predicts the PM2.5 concentration, and the warning module notifies users when the PM2.5 concentration exceeds the threshold set in the system. After the completion of
the system development, the system was tested comprehensively. The test results showed that the system ran stably and performed well in actual operation, and that the accuracy of its PM2.5 concentration prediction met practical needs. |
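
The two-layer stacking procedure described in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the base learners here are simple one-feature least-squares lines and the second-layer learner is a single blend weight, both chosen as stand-ins so the example runs with the standard library only. The function names (`fit_line`, `fit_blend`, `fit_stack`, `predict_stack`) and the toy data are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def predict_line(model, xs):
    slope, intercept = model
    return [slope * x + intercept for x in xs]


def fit_blend(p1, p2, ys):
    """Second-layer learner (stand-in): the weight w that minimises
    sum((w * p1 + (1 - w) * p2 - y) ** 2) over the held-out rows."""
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    if den == 0:
        return 0.5  # both base learners agree everywhere; any weight works
    num = sum((y - b) * (a - b) for a, b, y in zip(p1, p2, ys))
    return num / den


def fit_stack(rows, ys, split):
    """rows: list of (feature_a, feature_b); the first `split` rows train
    layer 1, the remaining rows are held out to train layer 2."""
    train_rows, hold_rows = rows[:split], rows[split:]
    train_y, hold_y = ys[:split], ys[split:]
    # Layer 1: one base learner per feature, fitted on the training part only.
    base = [fit_line([r[i] for r in train_rows], train_y) for i in (0, 1)]
    # The base learners' predictions on held-out rows are the layer-2 inputs.
    hold_preds = [predict_line(base[i], [r[i] for r in hold_rows]) for i in (0, 1)]
    # Layer 2: fit the blend weight against the held-out labels.
    weight = fit_blend(hold_preds[0], hold_preds[1], hold_y)
    return base, weight


def predict_stack(model, rows):
    base, w = model
    p1 = predict_line(base[0], [r[0] for r in rows])
    p2 = predict_line(base[1], [r[1] for r in rows])
    return [w * a + (1 - w) * b for a, b in zip(p1, p2)]


# Toy data (made up): the label depends only on the first feature.
rows = [(a, 1 - a % 2) for a in range(10)]
ys = [3 * a + 5 for a, _ in rows]

model = fit_stack(rows, ys, split=6)
print("blend weight:", model[1])
print("prediction:", predict_stack(model, [(10, 1)])[0])
```

On this toy data the second layer learns to rely entirely on the informative base learner. A real system would use stronger base learners and, as the abstract indicates, a regression model as the secondary learner.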
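
For readers unfamiliar with the three metrics reported above (R², RMSE, MAE), the sketch below computes them from their standard definitions. The function name `regression_metrics` and the sample values are illustrative assumptions and are unrelated to the study's data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE and MAE for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)               # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
    }


# Made-up concentrations in μg/m³, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
print(regression_metrics(observed, predicted))
```

RMSE and MAE carry the unit of the target (here μg/m³), while R² is dimensionless; RMSE penalises large errors more heavily than MAE.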