| Since the establishment of the capital asset pricing model and the option pricing formula, volatility has played a vital role in risk management, derivative pricing and portfolio management. With the rapid development of high-frequency trading, the formation mechanism of the intraday prices of financial assets has also changed greatly. Because intraday high-frequency data contain rich information, they are widely used to estimate the volatility of financial assets. In recent years, with the development of artificial intelligence theory, deep learning models based on neural networks have been widely used in finance, especially in financial time series forecasting. Taking the realized volatility series of the CSI 300 Index as the research object, this paper constructs a realized volatility prediction model based on EEMD-LSTM, that is, ensemble empirical mode decomposition (EEMD) combined with a long short-term memory (LSTM) network. The sample consists of 5-minute high-frequency data for all trading days of the CSI 300 Index from January 4, 2012 to December 30, 2021. First, the realized volatility series is decomposed by ensemble empirical mode decomposition into components with different oscillation frequencies. Then, a zero-mean significance test is carried out on each component series at the 5% significance level, and the components are reconstructed according to the test results into a high-frequency series, a low-frequency series and a residual term. Finally, the reconstructed series are used as input variables, an LSTM model is fitted to each of them, and the results are combined. The predictive performance is compared with that of the GARCH, SVM and LSTM models without EEMD decomposition and with that of the EEMD-GARCH and EEMD-SVM models. Through the empirical analysis, the following main conclusions are
obtained in this paper: (1) Through descriptive statistics and trend analysis of the realized volatility of the CSI 300 Index, this paper verifies that the volatility of the Chinese stock market exhibits dynamic characteristics such as volatility clustering and long memory. Ensemble empirical mode decomposition of the daily realized volatility yields 9 intrinsic mode functions and 1 residual term. Plots of the components show that the first 6 components have higher oscillation frequencies and the last 4 have lower oscillation frequencies. In addition, hypothesis tests on the component series show that the series with lower oscillation frequencies generally reject the zero-mean hypothesis, while those with higher oscillation frequencies do not. On this basis, this paper reconstructs the component series and conducts descriptive statistical analysis of the reconstructed series. The plots of the reconstructed series show that the low-frequency series closely follows the original time series, which indicates that the low-frequency series contains more information relevant to volatility forecasting, while the residual term shows the overall trend of realized volatility. These results show that ensemble empirical mode decomposition successfully extracts the characteristics of the original time series, which demonstrates the effectiveness of the method for nonlinear and non-stationary time series. (2) The series reconstructed by ensemble empirical mode decomposition (high-frequency series, low-frequency series and trend term) are used as input features of the models to produce out-of-sample one-step, three-step and five-step rolling forecasts of realized volatility. By comparing the size of the loss
functions, the study finds that the combined models with ensemble empirical mode decomposition (EEMD-GARCH, EEMD-SVM and EEMD-LSTM) predict better than the single models without it (GARCH, SVM and LSTM), and that among the combined models the EEMD-LSTM model predicts better than the EEMD-GARCH and EEMD-SVM models. Finally, to prevent outliers in the sample from distorting the loss-function results, this paper conducts a superior predictive ability (SPA) test on the above models. The test results show that the EEMD-LSTM model remains superior to the other models. |
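
The reconstruction step described in the abstract (a zero-mean test on each component at the 5% level, then grouping into high-frequency, low-frequency and trend series) can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes the components have already been produced by some EEMD routine, it approximates the two-sided 5% critical value of the t-test by the normal value 1.96 (reasonable only for long series), and it ignores serial correlation in the components. The function names `rejects_zero_mean` and `reconstruct` and the toy components are hypothetical.

```python
import math
import statistics

# Two-sided 5% critical value under a normal approximation (assumption:
# the series is long enough that the t distribution is close to normal).
Z_CRIT_5PCT = 1.96


def rejects_zero_mean(series, z_crit=Z_CRIT_5PCT):
    """Return True if the sample mean differs significantly from zero."""
    n = len(series)
    mean = statistics.fmean(series)
    sd = statistics.stdev(series)
    if sd == 0.0:
        # Constant series: the mean is either exactly zero or it is not.
        return mean != 0.0
    t_stat = mean / (sd / math.sqrt(n))
    return abs(t_stat) > z_crit


def reconstruct(imfs, residual):
    """Sum the components into high- and low-frequency series.

    Components that do not reject the zero-mean hypothesis go to the
    high-frequency series, components that reject it go to the
    low-frequency series, and the residual is kept as the trend term.
    """
    n = len(residual)
    high = [0.0] * n
    low = [0.0] * n
    for imf in imfs:
        target = low if rejects_zero_mean(imf) else high
        for t, value in enumerate(imf):
            target[t] += value
    return high, low, list(residual)


# Toy components standing in for EEMD output (hypothetical data).
n_obs = 200
fast = [(-1.0) ** t for t in range(n_obs)]                       # mean zero
slow = [2.0 + math.sin(2.0 * math.pi * t / n_obs) for t in range(n_obs)]
trend = [0.01 * t for t in range(n_obs)]
original = [f + s + r for f, s, r in zip(fast, slow, trend)]

high, low, trend_out = reconstruct([fast, slow], trend)
```

In this toy case the fast component is assigned to the high-frequency series and the slow component to the low-frequency series, and the three reconstructed series add back up to the original series. In the setting of the abstract, each reconstructed series would then be passed to its own forecasting model.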