A Study On China's GDP Forecasting Based On Shapley Regression Framework

Posted on: 2022-09-25
Degree: Master
Type: Thesis
Country: China
Candidate: L Ye
Full Text: PDF
GTID: 2480306521471084
Subject: Finance
Abstract/Summary:
The inevitability of economic development makes economic forecasting possible, while its contingency means that forecasts will always contain errors. Economic forecasts are an important basis on which governments and enterprises make economic decisions, formulate plans and manage the economy, so their accuracy matters. Given the complexity of the domestic and foreign economic environment, machine learning models often predict better than traditional models. However, because their structure is complex and opaque, they report only how good a prediction is; they do not reveal the degree or the significance of each influencing factor's effect on the explained variable. In other words, machine learning models trade interpretability for accuracy. Most of them are also non-parametric. As a result, this "black box problem" hinders their practical application, and the use of opaque machine learning models can raise ethical, security, privacy and, increasingly, legal issues.

This paper uses nested cross-validation to train neural network, support vector machine, random forest, extreme decision tree and XGBoost models, and compares the prediction accuracy of the machine learning models with that of a linear regression model in long-term and short-term forecasting, using mean squared error and R². After establishing that the machine learning models do predict better than linear regression, the paper calculates Shapley values for the random forest, extreme decision tree and XGBoost models, which show the largest improvement in both long-term and short-term forecasting. Shapley values at different points in time are used to compare the size of the influencing factors over time, and mean absolute Shapley values over the whole prediction interval are used to compare their overall size. Shapley regression is then used to connect machine learning with econometrics and to analyze the degree and significance of the impact on GDP of the Treasury yield spread, physical capital, degree of openness to the outside world, human capital, technology, stock market size, liquidity and financing rate.

The paper has six main findings. First, using higher-frequency mixed-frequency data improves prediction accuracy. Second, the development of the stock market is not in line with the development of the economy. Third, in both long-term and short-term forecasting, the support vector machine predicts no better than the linear regression model, whereas the neural network, random forest, extreme decision tree and XGBoost models are more accurate than linear regression, with random forest, extreme decision tree and XGBoost improving the most. Fourth, human capital is the most important influencing factor in both long-term and short-term forecasting, while physical capital is the most important influencing factor in long-term forecasting. Fifth, the size of the stock market cannot be used for economic forecasting in either the long term or the short term. Sixth, although the degree of openness to the outside world, the Treasury yield spread and the stock market financing rate contribute little to the long-term and short-term forecasts, their impact on economic development is highly significant, so they can provide useful signals about economic growth.

The paper makes three contributions. First, it uses higher-frequency mixed-frequency data and raises the frequency of the influencing factors to the daily level, which allows GDP to be predicted at high frequency and in a timely manner. Daily, monthly, quarterly and annual data are used, with the monthly, quarterly and annual series filled to daily frequency to improve the timeliness of the forecast. Second, although earlier studies also used machine learning models to predict China's economy, they focused on further improving a particular machine learning model and compared it neither with linear regression nor with other machine learning models. This paper uses the linear regression model as the benchmark and compares the prediction accuracy of six models: elastic net, neural network, support vector machine, random forest, extreme decision tree and XGBoost. Third, previous studies that used machine learning models reported only the overall accuracy of the economic forecast, without analyzing the degree and significance of each factor's influence on economic growth. The reason is that machine learning models trade interpretability for accuracy: as a model becomes more complex, higher prediction accuracy comes with lower interpretability. This paper uses Shapley regression to address the lack of interpretability of machine learning models, analyzing the role of each predictor while improving prediction accuracy. Shapley regression is a general framework for statistical inference on nonlinear models, especially machine learning models. The basic idea is to formulate a regression problem in the transformed input space defined by the Shapley decomposition of the model. Shapley regression makes individual model predictions interpretable, opens the door to parametric statistics for otherwise uninterpretable machine learning models, and provides a template for applying more econometric techniques to machine learning models in the future.
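The Shapley decomposition that the abstract builds on can be illustrated with a minimal sketch. This is not the thesis's implementation: the toy model, the all-zero baseline, and the helper name `shapley_values` are assumptions made purely for illustration, and features outside a coalition are simply replaced by their baseline value. The sketch computes exact Shapley values by enumerating every coalition of features, which is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition take their baseline value. This value
    function is an illustrative assumption, not the thesis's method.
    """
    n = len(x)

    def value(coalition):
        # Evaluate the model with only the coalition's features "switched on".
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when it joins coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical nonlinear "black box": a linear term plus an interaction.
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

For this toy model the linear term is attributed entirely to the first feature and the interaction is split equally between the other two, so the values come out as approximately 2, 3 and 3. They sum to the difference between the prediction at `x` and at the baseline. A Shapley regression in the sense described above would then regress the target on such per-feature components rather than on the raw inputs.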
Keywords: GDP, Machine learning, Shapley value, Shapley regression