Development and Application of an Automated Air Quality Forecasting System (AI-Air) Based on Machine Learning

Posted on: 2024-07-31
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H B Ke
GTID: 1520307106473714
Subject: Atmospheric physics and atmospheric environment
Abstract/Summary:
In recent years, rapid socioeconomic development has raised living standards, and air quality has drawn growing public attention, placing greater demands on ambient air quality forecasting. Numerical models are the current mainstream forecasting method and are widely used to forecast air pollutants. However, their performance is affected by uncertainties in the emission inventory, the initial and boundary conditions, atmospheric oxidation, and the chemical and physical processes and parameterization schemes, all of which introduce errors into the simulation results. With the continued development of machine learning, applying artificial intelligence (AI) to environmental weather forecasting shows great potential. This research developed an automated air quality forecasting system (AI-Air) based on machine learning. Built on knowledge bases and on internal modules for multi-model selection, hyperparameter optimization, feature selection, ensemble modeling, feature importance analysis, evaluation, and storage, the system automatically finds the best combination of "model + hyperparameters + input features" for each city and pollutant and so obtains the optimal forecast. The whole process runs without manual intervention. Using the system, this research carried out the following three applications, with the main results as follows.

(1) Forecasting of pollutant concentrations and levels by AI-Air. Based on meteorological observation data, pollutant concentration data, pollutant emission data, and model reanalysis data in the knowledge base, the forecasting performance of AI-Air for different pollutant concentrations and levels was studied using five years of observational data (2015-2019) for Beijing, Shanghai, Guangzhou, Chengdu, Xi’an, Wuhan, and Changchun in China. The optimal model selected by AI-Air varies across cities and pollutants. The correlation coefficient (R) of the system forecast basically exceeded 0.70, with small mean error (ME) and root mean square error (RMSE). The mean deviation (MB) values were greater than zero except for ozone, the mean absolute percentage error (MAPE) values were mostly within 40%, the accuracy (P) values were around 0.7-0.8, and the index of agreement (IA) values were greater than 0.8, showing satisfactory forecasting performance for multiple pollutants and for cities with different climatic and pollution characteristics. Forecasts with a pollutant level error of 0 or 1 accounted for more than 95% of cases for all pollutants in these cities, meaning that the forecasting system almost reproduces the actual situation. The system's forecasts for the next 3 days were also obtained, with average R values exceeding 0.70 at all forecast times, which is superior to most current operational numerical models.

(2) Optimization of model forecasts by AI-Air. Based on ground-observed and model-forecasted pollutant and meteorological data, together with some auxiliary variables, 46 national key cities were selected as application cases to study the system's adaptive optimization of CUACE, the Chinese operational air quality forecasting model. The results showed that AI-Air can intelligently select the most applicable model for each city and pollutant; the majority of the optimal models were the ensemble model (SG), Support Vector Regression (SVR), and Gradient Boosting Decision Tree (GBDT). The study also analyzed the spatial distribution of the optimal models and gave a preliminary account of the reasons for their selection. After optimization, the average PM2.5 values of ME, MB, RMSE, R, P, MAPE, and IA changed from 22.4 μg/m³, -0.6 μg/m³, 36.1 μg/m³, 0.49, 0.33, 78.9%, and 0.62 to 11.7 μg/m³, -0.5 μg/m³, 20.3 μg/m³, 0.72, 0.66, 46.0%, and 0.81, respectively. The other pollutants (O3, PM10, SO2, CO, and NO2) followed a similar pattern, improving the 5-day forecasts of the operational CUACE model. An impact analysis of the input features with SHAP showed that the dominant input features vary across cities and pollutants. In addition, a new XGBoost-SMOTE hybrid model was developed that balances the uneven proportion of high-pollution and normal samples and greatly improves the forecast of pollutant extreme values on high-pollution days, significantly improving the optimization of numerical model forecasts.

(3) Forecasting of the Chengdu Viewing Snow Mountain Index (CVSMI) by AI-Air. Five levels of CVSMI are defined: Level-1 means the snow mountains are not visible, and Level-2 to Level-5 mean they are progressively more visible from Chengdu. Based on recorded data, numerically forecasted meteorological data, and pollutant concentration data in the knowledge base, the forecasting performance of AI-Air for the CVSMI was studied. An analysis of the CVSMI showed that the number of days on which the snow mountains were visible increased year by year, concentrated mainly between April and October. Taking the sample balance problem into account and ranking by the weighted-F1 evaluation index, the LightGBM, LR, and OneVSRest models gave the best forecasts of the CVSMI, with weighted-F1 values exceeding 0.83 and satisfactory performance across CVSMI levels. The results further verify that the automated forecasting system generalizes well and has broad application prospects in different fields. The influence of different input features on the CVSMI forecast was also analyzed.
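
The evaluation statistics quoted in the abstract (ME, MB, RMSE, R, MAPE, IA) can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so the definitions below are assumptions based on common usage: ME is taken as the mean absolute error (consistent with ME being much larger than MB in the reported values), MB as the mean of forecast minus observation, and IA as Willmott's index of agreement. The accuracy P is left out because the abstract does not define it. The function name `forecast_metrics` is hypothetical and is not part of AI-Air.

```python
import math


def forecast_metrics(obs, fcst):
    """Compare forecasts with observations (illustrative definitions only).

    Assumed definitions, not taken from the dissertation:
      MB   = mean(F - O)                      mean deviation (bias)
      ME   = mean(|F - O|)                    mean (absolute) error
      RMSE = sqrt(mean((F - O)^2))
      MAPE = 100 * mean(|F - O| / O)          requires O > 0
      R    = Pearson correlation coefficient
      IA   = 1 - sum((F - O)^2) / sum((|F - Obar| + |O - Obar|)^2)
    """
    n = len(obs)
    if n == 0 or n != len(fcst):
        raise ValueError("obs and fcst must be non-empty and of equal length")
    if any(o <= 0 for o in obs):
        raise ValueError("MAPE needs strictly positive observations")

    mean_o = sum(obs) / n
    mean_f = sum(fcst) / n
    errors = [f - o for o, f in zip(obs, fcst)]
    sq_err = sum(e * e for e in errors)

    mb = sum(errors) / n
    me = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sq_err / n)
    mape = 100.0 * sum(abs(e) / o for e, o in zip(errors, obs)) / n

    # Pearson R; undefined (NaN) when either series is constant.
    cov = sum((o - mean_o) * (f - mean_f) for o, f in zip(obs, fcst))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_f = sum((f - mean_f) ** 2 for f in fcst)
    r = cov / math.sqrt(var_o * var_f) if var_o > 0 and var_f > 0 else float("nan")

    # Willmott's index of agreement; NaN when the denominator vanishes.
    ia_den = sum((abs(f - mean_o) + abs(o - mean_o)) ** 2 for o, f in zip(obs, fcst))
    ia = 1.0 - sq_err / ia_den if ia_den > 0 else float("nan")

    return {"MB": mb, "ME": me, "RMSE": rmse, "MAPE": mape, "R": r, "IA": ia}


if __name__ == "__main__":
    # Made-up concentrations in ug/m3, for demonstration only.
    observed = [10.0, 20.0, 30.0, 40.0]
    forecast = [12.0, 18.0, 33.0, 41.0]
    for name, value in forecast_metrics(observed, forecast).items():
        print(f"{name}: {value:.3f}")
```

Under these assumed definitions, a forecast that reproduces the observations exactly gives MB = ME = RMSE = MAPE = 0 and R = IA = 1, which is the direction of improvement reported for PM2.5 after optimization.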
Keywords/Search Tags: Automated air quality forecasting system (AI-Air), Knowledge base, Pollutant concentrations and levels, Optimization, Chengdu Viewing Snow Mountain Index (CVSMI)