Application Of Random Forest Model In Fine Particle Concentration Prediction In Taiyuan

Posted on:2018-06-02

Degree:Master

Type:Thesis

Country:China

Candidate:S Q Yang

Full Text:PDF

GTID:2321330536466082

Subject:Statistics

Abstract/Summary:

The primary pollutant is the dominant pollutant in the air, and its concentration is an important air quality index. As air quality worsens, timely measures such as forecasting and early warning become very important. As a typical energy and chemical industry base, Taiyuan has long suffered from air pollution. The city's air quality has attracted considerable attention from both the government and citizens, so further study of air quality in Taiyuan is urgently needed.

First, building on previous studies, this study analyzes daily air pollutant concentrations and surface meteorological conditions in Taiyuan from December 1, 2013 to December 31, 2016. The air quality characteristics and the distribution of days by primary pollutant over the three years show that the major primary pollutant in Taiyuan is particulate matter (PM₁₀ and PM₂.₅). PM₁₀ is the major primary pollutant in spring and summer, while PM₂.₅ is the major primary pollutant in autumn and winter. During 2014-2016, 138 days were rated moderately polluted or worse, of which 17 fell in spring and summer and 121 in autumn and winter. Because air quality in Taiyuan is worst in autumn and winter, this study predicts PM₂.₅ only for these two seasons.

Next, the theory of the random forest model used in this study is set out systematically. Then, drawing on previous studies and considering both air pollutants and meteorological parameters, this study collects the key factors that influence PM₂.₅ concentration and analyzes the Pearson correlation coefficient and Spearman rank correlation coefficient between PM₂.₅ concentration and these factors. Finally, 10-fold cross-validation is used to build a PM₂.₅ concentration forecasting model based on the random forest algorithm, and its results are compared with those of a traditional linear regression model, a Boosting regression model and a support vector regression model. The results show that, on the test set, the prediction performance indexes NMSE, MAE and RMSE rank from largest to smallest as: linear regression model > Boosting regression model > support vector regression model > random forest regression model. Meanwhile, the performance index R on the test set ranks from largest to smallest as: random forest regression model > support vector regression model > Boosting regression model > linear regression model. In summary, compared with the other three models, the random forest model offers higher prediction accuracy, stronger generalization ability and no need for feature selection. The method is therefore worth applying and promoting for the prediction of particulate concentrations in urban areas.
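
The abstract ranks the four models by NMSE, MAE, RMSE and R but does not give their formulas. The sketch below shows one common way such indexes are computed from observed and predicted concentrations. It rests on assumptions that are not stated in the source: NMSE is taken as the mean squared error divided by the variance of the observations, and R is taken as the Pearson correlation between observed and predicted values. The function name `regression_metrics` and the sample numbers are hypothetical and serve only as an illustration; they are not data from the thesis.

```python
import math


def regression_metrics(observed, predicted):
    """Return NMSE, MAE, RMSE and R for paired observed/predicted values.

    Assumptions (not taken from the thesis):
      - NMSE = MSE / population variance of the observed values
      - R    = Pearson correlation between observed and predicted values
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [p - o for o, p in zip(observed, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(mse)

    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    var_obs = sum((o - mean_obs) ** 2 for o in observed) / n
    var_pred = sum((p - mean_pred) ** 2 for p in predicted) / n
    if var_obs == 0 or var_pred == 0:
        raise ValueError("NMSE and R are undefined for constant inputs")

    cov = sum(
        (o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted)
    ) / n

    return {
        "NMSE": mse / var_obs,                   # lower is better
        "MAE": mae,                              # lower is better
        "RMSE": rmse,                            # lower is better
        "R": cov / math.sqrt(var_obs * var_pred),  # higher is better
    }


# Made-up daily concentrations, for illustration only.
observed = [35.0, 80.0, 120.0, 60.0]
predicted = [40.0, 75.0, 110.0, 65.0]
metrics = regression_metrics(observed, predicted)
```

Under these definitions the two rankings in the abstract point the same way: the three error indexes are smallest, and R is largest, for the random forest regression model.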

Keywords/Search Tags:

primary pollutant, Taiyuan city, cross validation, random forest regression model

Related items

1	Application Of Random Forest Model In The Study Of Primary Pollutant In Beijing
2	Research On Influence Factors Of Taxi Emission Factors Based On Random Forest Regression
3	A Prediction Model Of PM₂.₅ Concentrations In Shanghai Based On Random Forest
4	Urban PM₂.₅ Concentration Prediction Based On Parallel Random Forest
5	Influencing Factors Analysis And Forecast Of PM₂.₅ In Fushun City Based On Multi-Site Data
6	Soil Contamination Characteristic And Assessment In Suburb Of Taiyuan City
7	Study On Predicting Model Of Tectonic Deformed Coal Thickness Based On Regression Random Forest
8	Air Quality Forecasting Using BP Neural Network And Random Forest Model
9	Estimation And Spatio-temporal Changes In Net Primary Productivity Using Random Forest In The Hexi Corridor
10	Evaluation And Pattern Optimization Regulation Of Ecological Security In Xiangxiang City Based On Random Forest Model