
A Study On Hierarchical Reconciliation Forecasting And Its Applications Based On Boosting Trees

Posted on: 2022-10-09
Degree: Master
Type: Thesis
Country: China
Candidate: Q Chen
GTID: 2480306728497014
Subject: Applied Statistics
Abstract/Summary:
Time series data can usually be classified naturally according to various attributes of interest. For example, geographical division often gives rise to collections of time series that follow a hierarchical aggregation structure. Time series data with this kind of structure are called hierarchical time series data. Modeling and forecasting based on hierarchical time series data can ensure the accuracy of the forecast, and it can also ensure reconciliation of the forecast, that is, the forecasts add up in a manner consistent with the hierarchical structure. However, the traditional hierarchical time series model has two obvious shortcomings: it does not consider exogenous variables, and its accuracy is not high. In addition, the boosting tree algorithms that have emerged in recent years have attracted attention for their accuracy in time series forecasting, so combining boosting tree algorithms with hierarchical forecasting is expected to give better forecasting results. Therefore, this paper proposes three hierarchical reconciliation forecasting models based on boosting tree algorithms: the hierarchical forecasting model based on XGBoost, the hierarchical forecasting model based on GBDT, and the hierarchical forecasting model based on LightGBM, abbreviated as XGBoost-hts, GBDT-hts and LightGBM-hts. To verify the effectiveness of the new models, two case datasets are used in this paper: the preliminary data set of hourly power load demand in the 2017 Global Energy Competition data set, and the Australian monthly regional tourism data set from 1998 to 2017. The specific analysis process is as follows:

(1) The preliminary data set of hourly power load demand in the 2017 Global Energy Competition data set consists of hourly power load demand data of 10 regions in three levels. This paper considers explanatory variables such as hourly dry-bulb and wet-bulb temperature, time of day, week type, weekends, holidays, and historical power load demand from the previous 1 to 4 hours. Firstly, the data set was preprocessed, and the explanatory variables of each series were ranked by feature importance using the XGBoost algorithm. The feature variables whose ranking was stable in the top 8 were then selected as the explanatory variables of the benchmark models. For modeling, three explanatory variables are particularly important: power load demand in the previous hour, power load demand in the previous two hours, and time. Secondly, the benchmark models are XGBoost, GBDT and LightGBM. Before modeling, Bayesian optimization is used to select some important hyperparameters of the benchmark models. Each series is modeled and predicted by the model with its corresponding optimal hyperparameter values, and the corresponding fitted values and forecast values are saved. Finally, the hierarchical forecasting methods are applied to the per-series results of the benchmark models to obtain the hierarchical forecasting results of the corresponding models. The forecasting performance of each hierarchical reconciliation forecasting model is compared by calculating the evaluation indexes.

(2) The Australian monthly regional tourism data set from 1998 to 2017 consists of monthly tourism data of 110 regions in four levels. This paper mainly considers explanatory variables such as holidays, monthly average temperature, tourism index, scenic spot index, and time sequence. Firstly, the explanatory variables of each series are ranked by feature importance using the XGBoost algorithm. Because the output includes some lagged variables corresponding to the explanatory variables, and the output differs from series to series, the top 10 feature variables for each of the 110 regions are used as the explanatory variables of the model for that region. Secondly, the benchmark models are established based on the XGBoost, GBDT and LightGBM algorithms. Before modeling, Bayesian optimization is used to select some important hyperparameters of the benchmark models. Each series is modeled and predicted by the model with its corresponding optimal hyperparameters, and the corresponding fitted values and forecast values are saved. Finally, the hierarchical forecasting methods are applied to the per-series results of the benchmark models to obtain the hierarchical forecasting results of the corresponding models.

Through the above modeling process, the conclusions obtained are as follows:

(1) For the preliminary data set of hourly power load demand in the 2017 Global Energy Competition data set, the average MAPE values of the optimal hierarchical forecasting methods corresponding to the XGBoost-hts, GBDT-hts and LightGBM-hts models proposed in this paper are 0.86%, 1.05% and 0.80% respectively, and the corresponding optimal average RMSE values are 33.19, 41.13 and 31.90, while the optimal average MAPE values of the traditional hierarchical time series forecasting models ARIMA-hts and ETS-hts are 14.06% and 36.09% respectively. The forecasting accuracy of the new models proposed in this paper is therefore significantly better than that of the traditional hierarchical time series forecasting models, and the optimal hierarchical forecasting methods for the new models are mainly the bottom-up method and the MinT method.

(2) For the Australian monthly regional tourism data set from 1998 to 2017, the optimal hierarchical forecasting method for the XGBoost-hts, GBDT-hts and LightGBM-hts models proposed in this paper is the minimum trace method, the same method that is optimal for the traditional hierarchical time series forecasting models. Comparing the evaluation indexes MAPE, RMSE, MAE and MASE shows that the three hierarchical reconciliation forecasting models proposed in this paper improve on the traditional hierarchical forecasting models to different degrees, ranked by degree of improvement as XGBoost-hts, GBDT-hts and LightGBM-hts.
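
The reconciliation step named in the abstract (bottom-up and minimum trace) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the three-series hierarchy (total = A + B), the base forecast numbers, the function names, and the identity weight matrix `W`, which reduces minimum trace to ordinary least squares reconciliation. The thesis applies these methods to forecasts from XGBoost, GBDT and LightGBM and may estimate `W` differently, for example from the saved fitted values.

```python
import numpy as np

# Toy hierarchy (an assumption, not the thesis data): total = A + B.
# Rows of the summing matrix S are ordered [total, A, B]; columns are [A, B].
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])


def bottom_up(base):
    """Keep the bottom-level base forecasts and aggregate them with S."""
    n_bottom = S.shape[1]
    return S @ base[-n_bottom:]


def min_trace(base, W):
    """Minimum trace reconciliation: S (S' W^-1 S)^-1 S' W^-1 base."""
    W_inv = np.linalg.inv(W)
    G = np.linalg.solve(S.T @ W_inv @ S, S.T @ W_inv)
    return S @ G @ base


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))


# Hypothetical base forecasts, one per series, standing in for the
# per-series outputs of a boosting tree model. They are incoherent:
# 58 + 45 = 103, but the total was forecast as 100.
base = np.array([100.0, 58.0, 45.0])

bu = bottom_up(base)                 # total is replaced by A + B
mint = min_trace(base, np.eye(3))    # identity W is an assumption (OLS case)

print("bottom-up:", bu)
print("min trace:", mint)
```

Both outputs are coherent, meaning the first entry equals the sum of the other two. Bottom-up leaves the bottom-level forecasts unchanged, while minimum trace adjusts every level, which is why the choice of `W` matters in practice.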
Keywords/Search Tags:boosting tree, hierarchical reconciliation forecasting, XGBoost-hts model, LightGBM-hts model, GBDT-hts model