
Statistical Inference Of Several High-dimensional Time Series Models

Posted on: 2021-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: H J Zheng
GTID: 2480306050472564
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Time series data and models are widely used across many fields and remain an active research topic. As data collection capabilities improve, the dimension of the observed variables grows, and traditional time series models are not well suited to such high-dimensional data. This is especially true for high-dimensional compositional (component) time series: the observation vector at each time point is compositional data, which does not match the distributional assumptions of traditional time series models. The classical remedy is to apply a logarithmic transformation to remove the structural constraints on the data, but when the data are zero-inflated the logarithmic transformation is not feasible. In addition, although the classical dynamic factor model can extract a small number of highly informative common factors from time series data and analyze the model through them, current estimation methods cannot produce a sparse estimate of the factor loading matrix when the data dimension is high. To address these two problems, this thesis studies the following three topics:
(1) For high-dimensional compositional time series, the data are assumed to follow a Dirichlet distribution, with the expectation of the observation vector at each time point linearly related to the observation at the previous time point. A sparse estimate of the influence relation matrix is obtained by maximizing the log-likelihood function. Two groups of experiments are set up to study estimation accuracy. In the low-dimensional case, the BFGS algorithm performs best and is relatively stable. In the high-dimensional case, the elastic net penalty performs best and gives the lowest error rate. The model is applied to time series data on the bacterial composition of the human vagina to study the relationship between the bacteria and bacterial vaginitis.
(2) For compositional time series, a new transformation is proposed and two new operations are defined to handle zero-inflated data in the high-dimensional case. A sparse estimate of the influence relation matrix is obtained with a penalized least squares algorithm. In the low-dimensional case, the DFP algorithm performs best and the steepest descent method is the most stable. In the high-dimensional case, the elastic net penalty performs best. The model is again applied to the vaginal bacteria data to study the relationships among the bacteria.
(3) For the high-dimensional dynamic factor model, the specific steps of a regularized expectation-maximization algorithm (ERM), based on Kalman filtering and smoothing, are derived to obtain a sparse estimate of the factor loading matrix. The experimental results show that the adaptive penalty function performs best. The model is applied to stock data from the Shenzhen and Shanghai Stock Exchanges to build models of financial market return and volatility. The distribution of the non-zero elements of the factor loading matrix shows that, in both models, one common factor affects the vast majority of stocks while several industry factors each affect only a subset of stocks. The thesis analyzes the trends of these factors in light of domestic and international current affairs and politics and offers possible explanations.
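As a rough illustration of the Dirichlet model in item (1), the following is a minimal sketch of a penalized objective, not the thesis's actual implementation. It assumes a conditional mean of the form mu_t = A x_{t-1} with a column-stochastic matrix A (so the mean stays on the simplex), a fixed precision parameter `phi`, and an elastic net penalty with weights `lam` and `l1_ratio`. All function and parameter names here are hypothetical, and the parametrization used in the thesis may differ.

```python
import math

def dirichlet_logpdf(x, alpha):
    """Log-density of Dirichlet(alpha) at a strictly positive composition x.

    Zero entries in x make log(x_i) undefined, which is the zero-inflation
    problem the abstract refers to.
    """
    return (math.lgamma(sum(alpha))
            - sum(math.lgamma(a) for a in alpha)
            + sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x)))

def conditional_mean(A, x_prev):
    """Assumed linear mean: mu_t = A @ x_{t-1} (A given as a list of rows)."""
    return [sum(a_ij * xj for a_ij, xj in zip(row, x_prev)) for row in A]

def elastic_net(A, lam, l1_ratio):
    """Elastic net penalty: lam * (l1_ratio * |A|_1 + (1 - l1_ratio)/2 * |A|_2^2)."""
    l1 = sum(abs(a) for row in A for a in row)
    l2 = sum(a * a for row in A for a in row)
    return lam * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2)

def penalized_neg_loglik(A, X, phi, lam=0.0, l1_ratio=0.5):
    """Negative conditional log-likelihood of the series X plus the penalty.

    X is a list of compositions (each sums to 1); the Dirichlet parameters
    at time t are assumed to be alpha_t = phi * mu_t.
    """
    nll = 0.0
    for x_prev, x in zip(X[:-1], X[1:]):
        mu = conditional_mean(A, x_prev)
        alpha = [phi * m for m in mu]
        nll -= dirichlet_logpdf(x, alpha)
    return nll + elastic_net(A, lam, l1_ratio)
```

Minimizing `penalized_neg_loglik` over A with a quasi-Newton or gradient method, subject to the constraints on A, would give a sparse estimate of the influence relation matrix under these assumptions.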
Keywords/Search Tags: High-dimensional time series, Component time series, Dirichlet distribution, Zero-inflated data, Dynamic factor model