| With rising productivity, rapid growth of the national economy and continuous improvement in people’s quality of life, China’s total electricity demand has increased rapidly. This has brought problems such as an imbalance between electricity supply and demand and increased carbon emissions, which have a great impact on the industrial economy, social development and residents’ lives. A comprehensive, timely and accurate forecast and analysis of China’s electricity demand can support the solution of these problems, and is of great significance for ensuring the rational operation and planning of the electricity system, promoting sustainable economic development and helping to achieve the carbon peak and carbon neutrality targets. This thesis forecasts total electricity consumption using ensemble learning. The main work covers the following three aspects. 1. Constructing a system of influencing factors and screening features. First, this thesis analyses the current situation, structure and influencing factors of the whole society’s electricity consumption, and obtains 22 influencing factors of electricity consumption. Then, based on the nonparametric independence screening theory proposed by Fan et al., a nonparametric independence screening variable selection method is implemented independently in code and combined with the variance filtering method, the correlation coefficient method and the random forest method to determine the 9 most important influencing factors of electricity consumption, ensuring that the factor selection is reasonable. 2. Constructing several forecasting models for total electricity consumption, in view of the influence of the selected variables and the time-series characteristics of electricity consumption data. An ARIMA model is built to analyse the overall trend and pattern of change of electricity consumption as a time series; an LSTM model is built to capture the impact of historical electricity consumption levels and the influencing factors on electricity demand; and an XGBoost model is built using the influencing factors of electricity consumption as input features. Comparing the average relative error of these three models on the test set shows that the XGBoost model has the smallest average relative error and the best results. 3. Establishing an ARIMA-LSTM-XGBoost fusion model based on ensemble learning. In order to consider both the time-series characteristics of electricity consumption and the influence of external factors, to exploit the strengths of each model while offsetting its weaknesses, to make full use of the information in the data and to improve forecast accuracy, this thesis builds an ARIMA-LSTM-XGBoost fusion model based on Stacking ensemble learning. Comparing the fusion model with the other three models on the test set shows that its relative error varies less from year to year and is more stable, with an average relative error of 1.05% and better forecasting results. Therefore, this thesis uses the ARIMA-LSTM-XGBoost fusion model as the forecasting model for total electricity consumption and produces forecasts for 2022 and 2023. The forecast shows that the demand for electricity in China will continue to increase over the next two years. This will require the joint efforts of the electricity production sector and the power industry, among others, to accelerate the construction of multiple forms of electricity supply infrastructure and ease the pressure on electricity supply. Clean energy should also be used more widely to supply electricity, cutting environmental pollution and helping to achieve the carbon peak and carbon neutrality targets. |
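
The Stacking idea and the average relative error used to compare models can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the two base forecasters (`naive_last` and `drift`) are simple hypothetical stand-ins for ARIMA, LSTM and XGBoost, the no-intercept linear meta-learner is an assumption (the abstract does not name the meta-learner), and the toy series is illustrative, not real consumption data.

```python
from typing import Callable, List, Sequence, Tuple

Forecaster = Callable[[Sequence[float]], float]


def mean_relative_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |predicted - actual| / |actual| over the evaluation set."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical base learners (stand-ins for the thesis's ARIMA, LSTM, XGBoost).
def naive_last(history: Sequence[float]) -> float:
    """Forecast the next value as the last observed value."""
    return float(history[-1])


def drift(history: Sequence[float]) -> float:
    """Forecast the next value by extending the average historical step."""
    step = (history[-1] - history[0]) / (len(history) - 1)
    return float(history[-1] + step)


def walk_forward(series: Sequence[float], models: Sequence[Forecaster],
                 start: int) -> Tuple[List[List[float]], List[float]]:
    """Collect out-of-sample base forecasts: each row only sees data before t."""
    rows, targets = [], []
    for t in range(start, len(series)):
        history = series[:t]
        rows.append([m(history) for m in models])
        targets.append(series[t])
    return rows, targets


def fit_two_weight_meta(rows: Sequence[Sequence[float]],
                        targets: Sequence[float]) -> Tuple[float, float]:
    """Least-squares weights (no intercept) for exactly two base forecasts."""
    s11 = sum(r[0] * r[0] for r in rows)
    s12 = sum(r[0] * r[1] for r in rows)
    s22 = sum(r[1] * r[1] for r in rows)
    s1y = sum(r[0] * y for r, y in zip(rows, targets))
    s2y = sum(r[1] * y for r, y in zip(rows, targets))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("base forecasts are collinear; weights are not unique")
    # Cramer's rule on the 2x2 normal equations.
    return (s1y * s22 - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def stacked_forecast(history: Sequence[float], models: Sequence[Forecaster],
                     weights: Sequence[float]) -> float:
    """Meta-learner output: weighted sum of the base forecasts."""
    return sum(w * m(history) for w, m in zip(weights, models))


if __name__ == "__main__":
    toy_series = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]  # illustrative values only
    base_models = [naive_last, drift]
    meta_rows, meta_targets = walk_forward(toy_series, base_models, start=2)
    weights = fit_two_weight_meta(meta_rows, meta_targets)
    fitted = [sum(w * x for w, x in zip(weights, row)) for row in meta_rows]
    print("meta-learner weights:", weights)
    print("in-sample mean relative error:", mean_relative_error(meta_targets, fitted))
    print("next-step stacked forecast:", stacked_forecast(toy_series, base_models, weights))
```

The base forecasts fed to the meta-learner are generated walk-forward, so each one is made without seeing its own target. This is the property that separates Stacking from simply averaging in-sample fits. On this perfectly linear toy series the meta-learner places essentially all of its weight on the drift forecaster.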