
A Prediction Model Of PM2.5 Concentrations In Shanghai Based On Random Forest

Posted on: 2018-09-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Wang
GTID: 2321330515451461
Subject: Cartography and Geographic Information System
Abstract/Summary:
With the rapid development of China's economy and the acceleration of industrialization and urbanization, air pollution, chiefly PM2.5, is becoming more and more serious. Haze and other air pollution problems not only seriously affect people's daily lives and physical health, but also have a great impact on the sustainable development of society. Accurately predicting the concentration of atmospheric pollutants therefore has great practical significance and social value. This paper studies the concentration of PM2.5 in Shanghai from the following aspects.

Firstly, missing values in the original data are filled using the KNN algorithm, taking the influence of other factors into account; the fluctuation of the filled data is consistent with that of the original data.

Secondly, the distribution of PM2.5 in Shanghai is analyzed on monthly and weekly time scales, summarizing the change in monthly average PM2.5 concentration, the monthly air quality ratio, and the daily change in PM2.5 concentration within a week.

Thirdly, the correlations between PM2.5 and other pollutants, and between PM2.5 and meteorological factors, are analyzed. The Pearson correlation coefficient matrix is calculated, and the magnitude and direction of the correlation between PM2.5 and each of the other factors are identified. In addition, a stepwise regression equation is established, with the Akaike information criterion used as the stopping criterion. The experimental results show that the goodness of fit of the stepwise regression equation increases from 66% to 85%.

Finally, an hourly model and an extreme value model for predicting PM2.5 mass concentration are established using the random forest algorithm. The hourly model predicts the PM2.5 concentration for each of the next 1 to 6 hours, and the extreme value model predicts the maximum and minimum PM2.5 concentrations over the next 6 to 12 hours, 12 to 24 hours, and 24 to 48 hours. The experimental results show that the prediction accuracy of the random forest algorithm is more than 90%, an improvement of 30% over the reference model. Selecting the optimal subset of variables based on the OOB (out-of-bag) error estimation method increases the goodness of fit of the model by 1.05%.
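
The abstract does not give implementation details for the preprocessing steps. As a rough illustration only, the following minimal Python sketch shows one way KNN filling of missing PM2.5 values and a Pearson correlation coefficient could be computed. The column layout (wind speed, humidity, PM2.5), the toy values, the choice of k = 2, and the unscaled Euclidean distance are all assumptions made for this example and are not taken from the thesis.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def knn_impute(rows, target, k=2):
    """Fill None entries in column `target` with the mean of that column
    over the k nearest complete rows.

    Distance is Euclidean over all other columns, which are assumed to be
    complete. In practice the features would usually be scaled first.
    """
    others = [c for c in range(len(rows[0])) if c != target]
    complete = [r for r in rows if r[target] is not None]
    filled = []
    for r in rows:
        new_row = list(r)
        if r[target] is None:
            point = [r[c] for c in others]
            nearest = sorted(
                complete,
                key=lambda cand: math.dist(point, [cand[c] for c in others]),
            )[:k]
            new_row[target] = sum(n[target] for n in nearest) / len(nearest)
        filled.append(new_row)
    return filled


# Hypothetical toy records: (wind speed, relative humidity, PM2.5).
rows = [
    (1.0, 80.0, 90.0),
    (2.0, 70.0, 70.0),
    (3.0, 60.0, 50.0),
    (4.0, 50.0, 30.0),
    (2.1, 69.0, None),  # missing PM2.5 value to be filled
]

filled = knn_impute(rows, target=2, k=2)

# Correlation between wind speed and PM2.5 over the complete records.
r = pearson([row[0] for row in rows[:4]], [row[2] for row in rows[:4]])
```

In this toy data the missing value is filled from the two records closest in wind speed and humidity, and the sign of `r` gives the direction of the relationship between wind speed and PM2.5, which is the kind of information the correlation matrix in the abstract is used to identify.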
Keywords/Search Tags: PM2.5, stepwise regression, OOB error estimation, random forest