
Statistical Inference for Several Correlation Relationships

Posted on: 2024-11-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Y Yang
Full Text: PDF
GTID: 1520307373969249
Subject: Mathematics
Abstract/Summary:
Correlation measures, a crucial statistical tool for describing the relationships among variables, have been widely used in statistical applications and modern scientific research. However, as data structures grow more complex and data dimensionality increases, traditional correlation measures face low algorithmic efficiency or limited applicability. From the perspectives of three correlation measures, namely the phase difference, the partial correlation coefficient, and the covariance matrix, this dissertation studies the following statistical problems.

(1) Estimation of, and statistical inference for, the phase difference (PD) between periodic time series. The current literature concentrates on period estimation and periodic function estimation for individual periodic time series, and often overlooks potential lead-lag relationships among similar periodic time series. The phase difference is an essential measure of the lead-lag relationship between two or more time series. However, conventional signal processing methods for estimating the PD are inadequate for time series based on complex periodic functions, and the statistical theory for PD inference remains underdeveloped. This dissertation therefore introduces the Nadaraya-Watson estimator under the circular distance and a circular kernel function, proposes a nonparametric two-step iterative method for estimating the PD and the periodic function, and uses the bootstrap to test the significance of the PD. It also proves the asymptotic normality of the estimated parameters and of the estimated periodic function. Numerical simulations demonstrate the superiority of the proposed method over other estimation methods in terms of estimation error and stability for multiple periodic functions, and real data analyses validate its utility and interpretability on air pollution data and measles epidemiological data.

(2) Estimation of the high-dimensional partial correlation coefficient (Pcor). The Pcor quantifies the correlation between two random variables after removing the effect of the controlling variables. When the controlling variables are high-dimensional, traditional least squares estimation is inadequate, which has led to the proposal of various Pcor estimation methods suited to such settings. However, most of the existing literature focuses on testing whether the Pcor is zero. The accuracy and efficiency of the estimated Pcors have not been sufficiently investigated, and a systematic summary and evaluation of these methods is lacking. This dissertation organizes existing Pcor estimation methods, integrates them with regularization methods so that they apply to high-dimensional controlling variables, and provides the specific implementation procedures and time complexity. Extensive simulation studies examine the algorithmic efficiency of the listed Pcor estimation methods under sparse and non-sparse conditions, and find that they generally have a systematic bias in estimating the high-dimensional Pcor: the absolute value of the estimated Pcor tends to be lower than its true value, especially when the Pcor is positive. The quadratic regression construction method is shown to effectively mitigate the impact of the controlling variables, thereby reducing estimation bias. Furthermore, a real data analysis of stock data verifies the applicability and validity of the Pcor estimation methods and reveals a connection between significant fluctuations in the estimated Pcors and the global financial crisis.

(3) Estimation of high-dimensional covariance matrices. For high-dimensional datasets, the traditional sample covariance matrix may not be applicable, and researchers have developed various methods for estimating high-dimensional covariance matrices. However, many existing methods assume a sparse covariance matrix, neglecting the challenge of non-sparse covariance matrices. This is particularly pertinent in finance, where variables frequently exhibit dense interdependencies due to shared influencing factors. This dissertation proposes a method for estimating high-dimensional covariance matrices based on matrix element aggregation, which is applicable to both sparse and non-sparse matrices and has concise implementation steps and low computational complexity. It establishes the consistency of the proposed method and its convergence rate in specific scenarios. In numerical simulations, the proposed method outperforms existing covariance matrix estimation methods by reducing relative estimation errors. Empirical studies on financial portfolio optimization demonstrate significant risk management benefits of the proposed method, especially its effectiveness in reducing portfolio risk despite small sample sizes.
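
To make the partial correlation coefficient in part (2) concrete, the following is a minimal sketch of its classical low-dimensional definition: regress each of the two variables on the controlling variables by ordinary least squares, then correlate the residuals. This is not one of the dissertation's high-dimensional estimators, which the abstract does not specify. The function name `partial_correlation`, the simulated data, and the coefficient vector are illustrative assumptions.

```python
import numpy as np


def partial_correlation(x, y, z):
    """Classical partial correlation of x and y given the columns of z.

    Illustrative helper (an assumption, not from the dissertation):
    it uses ordinary least squares, so it is only suitable when the
    number of controlling variables is small relative to the sample size.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    if z.ndim == 1:
        z = z[:, None]
    # Intercept column plus the controlling variables.
    design = np.column_stack([np.ones(len(x)), z])
    # Residuals of x and y after regressing each on the controls.
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    # The partial correlation is the correlation of the two residuals.
    return float(np.corrcoef(res_x, res_y)[0, 1])


# Simulated example: x and y are related only through the controls z.
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=(n, 3))
shared = z @ np.array([1.0, -0.5, 2.0])  # hypothetical common factor
x = shared + rng.normal(size=n)
y = shared + rng.normal(size=n)

raw = float(np.corrcoef(x, y)[0, 1])
pcor = partial_correlation(x, y, z)
print(f"raw correlation: {raw:.3f}, partial correlation: {pcor:.3f}")
```

In this simulated setting the raw correlation is large because both variables load on the same controls, while the partial correlation is close to zero. When the number of controlling variables approaches or exceeds the sample size, the least squares step stops being reliable, which is the situation the abstract's regularized methods are meant to address.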
Keywords/Search Tags:covariance matrix, high-dimensional data, nonparametric regression, partial correlation coefficient, phase difference