
Research On AQI Prediction Model Of Hefei City Based On Boosting Algorithm

Posted on: 2022-05-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y K Zhou
Full Text: PDF
GTID: 2480306482468904
Subject: Applied Statistics
Abstract/Summary:
Air quality affects not only people's work and daily life but also, to some extent, the overall competitiveness of a city, so it is receiving more and more attention. Based on daily mean air quality data for Hefei City, this thesis studies the current state of the Air Quality Index (AQI) and how well AQI can be predicted.

First, through exploratory data analysis, the thesis describes the overall profile and patterns of change of air quality, pollutants, and meteorological factors in Hefei City. The Pearson correlation coefficient is used to build a correlation coefficient matrix and a heat map. By comparing the correlation between each feature and AQI, the main features affecting the air quality index and the relationships among them are explored in a preliminary way. AQI is found to be driven mainly by PM2.5 and PM10, and several features are collinear. The analysis shows that air quality in Hefei has improved remarkably and that the quality of the local living environment keeps improving. However, the seasonal difference in air pollution is still very obvious. Corresponding measures should be taken in winter and spring, controlling emissions of air pollutants such as PM2.5 and CO, to avoid further deterioration of air quality.

Second, after data preprocessing, Lasso regression is used for feature selection, and performance evaluation metrics are chosen to give a common standard of comparison for building and improving the models. Three prediction models, XGBoost, LightGBM, and CatBoost, are then built, and their performance is improved by parameter tuning. The tuned models are compared and evaluated: each has its own advantages and disadvantages, and overall the CatBoost prediction model performs best.

Finally, because each of the three models falls short in either accuracy or efficiency, Stacking is used to fuse the models and improve on them. The three models built in this thesis serve as the base models in the first layer of the fusion. Their predicted outputs are used as the input variables of the second layer, for which linear regression is chosen as the meta-model. The stacked model improves the goodness of fit, which reaches 99.45%, and further improves prediction accuracy without over-fitting. The results indicate that the model can effectively predict the daily mean air quality index and can give relevant departments scientific and effective guidance for formulating and issuing air pollution early warnings.
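The two-layer fusion described above can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: it assumes scikit-learn is available, uses synthetic data from `make_regression` in place of the Hefei air quality data, and swaps in scikit-learn's own boosting regressors as stand-ins for XGBoost, LightGBM, and CatBoost. The function names `build_stacked_model` and `run_demo` and all hyperparameter values are hypothetical.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    AdaBoostRegressor,
    GradientBoostingRegressor,
    StackingRegressor,
)
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def build_stacked_model(random_state=0):
    """Two-layer stack: three boosting base models, linear regression on top."""
    # First layer: stand-ins for the three tuned boosting models.
    # (Assumption: the thesis uses XGBoost, LightGBM and CatBoost here.)
    base_models = [
        ("gbr_shallow", GradientBoostingRegressor(
            n_estimators=50, max_depth=2, random_state=random_state)),
        ("gbr_deep", GradientBoostingRegressor(
            n_estimators=50, max_depth=4, random_state=random_state)),
        ("ada", AdaBoostRegressor(n_estimators=50, random_state=random_state)),
    ]
    # Second layer: linear regression fitted on the base models'
    # cross-validated (out-of-fold) predictions, which limits leakage
    # of the training targets into the meta-model.
    return StackingRegressor(
        estimators=base_models,
        final_estimator=LinearRegression(),
        cv=5,
    )


def run_demo(random_state=0):
    """Fit the stack on synthetic data and return (model, test R^2)."""
    # Synthetic regression data standing in for pollutant and weather features.
    X, y = make_regression(
        n_samples=400, n_features=6, n_informative=4,
        noise=10.0, random_state=random_state,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=random_state
    )
    model = build_stacked_model(random_state)
    model.fit(X_train, y_train)
    return model, r2_score(y_test, model.predict(X_test))


if __name__ == "__main__":
    model, r2 = run_demo()
    print("meta-model weights per base model:", model.final_estimator_.coef_)
    print("held-out R^2:", round(r2, 4))
```

The meta-model's coefficients show how much weight the linear second layer gives each base model, and scoring on a held-out split is one simple way to check for the over-fitting the abstract mentions.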
Keywords/Search Tags: air quality prediction, correlation analysis, feature selection, Boosting algorithm, model fusion