
Rainfall Prediction Based On CatBoost Algorithm

Posted on: 2024-09-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liu
Full Text: PDF
GTID: 2530307106486094
Subject: Applied statistics
Abstract/Summary:
In contemporary society, climate change, water shortages, and the needs of agriculture, water conservancy, and disaster prevention and mitigation make accurate rainfall prediction very important. Rainfall is a meteorological variable that reflects the basic precipitation conditions of a region. It directly affects natural processes such as river runoff, soil moisture, and vegetation growth, as well as human concerns such as food security, water conservancy projects, and urban planning. However, rainfall shows strong spatio-temporal heterogeneity and randomness and is influenced by many meteorological systems and weather phenomena, so it is difficult to describe and predict with simple rules.

To improve the accuracy and reliability of rainfall prediction, this thesis combines satellite band data with hourly rainfall data from ground meteorological stations and uses the CatBoost model with the Optuna automatic optimization framework to predict rainfall, applying both in the meteorological field. At present, most rainfall prediction studies rely on historical rainfall time series and on time series models such as ARIMA, without using data collected by meteorological satellites. This thesis therefore selects hourly image data collected by the Himawari-8 satellite over the four months from September 1, 2021 to December 30, 2021 for 56 meteorological stations in southeast China, reads the data of 16 bands from the images, and combines them with conventional meteorological data collected by the ground stations, such as temperature, dew point temperature, and relative humidity. In total, 21 variables are used to predict rainfall.

In the data preprocessing stage, the data are checked for missing and duplicate values, and outliers are handled. Exploratory analysis follows: each variable is examined through descriptive and visual analysis to explore its relationship with rainfall. For rainfall forecasting based on satellite band data, this thesis selects the CatBoost model, which provides variable importance, and uses the Optuna automatic hyperparameter tuning framework to select the optimal values of the CatBoost hyperparameters. The XGBoost model, a widely used machine learning model, is chosen for comparison.

Across several evaluation metrics, CatBoost performs almost the same as XGBoost, with only a slight improvement. Compared with the original CatBoost model, the CatBoost model tuned automatically with Optuna identifies more positive samples, and all evaluation metrics improve: the accuracy is 0.871, the precision is 0.670, the recall is 0.756, and the AUC is 0.897. The CatBoost model ranks band 15, instantaneous wind speed, relative humidity, band 16, and ground temperature as the five most important variables.

In summary, the Optuna-tuned CatBoost model explored in this thesis is a feasible and effective rainfall prediction method that offers a new approach to rainfall prediction at both the data level and the algorithm level. The hyperparameter optimization framework can improve the predictive performance of a machine learning model, and mining the information in meteorological satellite band data can also help improve the accuracy of rainfall prediction to a certain extent. The method can provide reference and support for weather forecasting and for disaster prevention and mitigation, and thereby help safeguard people's production and everyday safety.
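
The four metrics usually reported for a binary rain/no-rain classifier are accuracy, precision, recall, and AUC, and they respond differently when rainy hours are the minority class. The sketch below shows one way to compute them in plain Python. It is an illustration only: the labels, scores, and 0.5 threshold are made-up toy values, treating "rain" as the positive class is an assumption, and none of this is the thesis's data or code.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), treating label 1 (assumed: rain) as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Share of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp: int, fp: int) -> float:
    """Share of predicted-rain samples that really were rain."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of real rain samples that the model found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = rain, 0 = no rain, with predicted rain probabilities.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.8, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, fp, fn, tn)
prec = precision(tp, fp)
rec = recall(tp, fn)
auc = roc_auc(y_true, scores)
```

Accuracy, precision, and recall depend on the chosen threshold, while AUC depends only on how the scores rank the samples, which is why a tuned model can find more positive samples (higher recall) with little change in accuracy.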
Keywords/Search Tags: Rainfall prediction, CatBoost algorithm, Hyperparameter optimization