Research On Joint Mean-Covariance Modelling For Longitudinal Data Within The Framework Of Generalised Estimating Equations

Posted on: 2022-04-01  Degree: Doctor  Type: Dissertation
Country: China  Candidate: N Li  Full Text: PDF
GTID: 1487306485474684  Subject: Statistics
Abstract/Summary:
Longitudinal data are repeated measurements taken on each subject over time or space. A key feature of such data is that observations from different subjects are independent, whereas observations on the same subject are intrinsically correlated. The within-subject correlation should therefore be fully accounted for in any longitudinal analysis. Traditional methods mainly model the mean of the response under a given covariance structure and a distributional assumption (i.e., the covariance structure is known, but its parameter values are unknown). Although this approach is simple and convenient, it is difficult to verify that the collected data actually follow the assumed distribution. Moreover, misspecifying the covariance structure may seriously affect statistical inference, e.g., the efficiency of the mean parameter estimators. Modelling the mean and covariance jointly without any distributional assumptions is therefore an important and challenging problem in longitudinal data analysis.

A large body of literature already addresses joint mean-covariance modelling for longitudinal data without distributional assumptions. These works build joint generalized estimating equations (GEE) under different decompositions of the covariance matrix. The advantage of GEE is that it requires only low-order moment assumptions rather than a full distributional assumption. In this setting it is important to screen out, quickly and effectively, the variables that have the greatest impact on the response, yet there has been little research on variable selection for these models. An efficient method that enjoys the oracle property and is easy to implement is needed. A second pressing problem, once a joint mean-covariance model has been specified, is to judge whether the model is reasonable and to cope with possible deviations between the actual data and the model. Statistical diagnostics and robust statistics can be used to solve these problems, but little work exists on either for joint mean-covariance modelling of longitudinal data within the GEE framework. This study therefore addresses three problems for mean-covariance modelling of longitudinal data within the GEE framework: variable selection, statistical diagnostics, and robust estimation.

Firstly, we developed a variable selection procedure using smooth-threshold joint generalized estimating equations (SJGEE) based on the modified Cholesky decomposition (MCD) of the covariance matrix. The procedure automatically eliminates inactive predictors by setting the corresponding parameters to zero, and simultaneously estimates the nonzero coefficients by SJGEE. A penalized weighted deviance criterion was used to choose the tuning parameters, and the Newton-Raphson algorithm was used to obtain the sparse solutions of the equations. Under some regularity conditions, we established the oracle property of the selection method and the large-sample properties of the estimators, including consistency and asymptotic normality. The procedure is both flexible and easy to implement, because it avoids the convex optimization problems that arise in penalized variable selection methods. Its performance is demonstrated in simulation studies and real data analyses.

Secondly, a case-deletion diagnostic was established for the mean-covariance model within the GEE framework under the MCD of the covariance matrix. The diagnostic procedure aims to detect and identify outliers and influential points in real data sets. We first gave computationally feasible one-step approximation formulas for the mean parameters, the generalised autoregressive parameters, and the innovation variances. These formulas were used to calculate the change in the estimators caused by deleting the observations of an arbitrary subject. The correlation information criterion (CIC) and the pseudo-Fisher information matrix were then used to construct a generalized Cook distance. The generalized Cook distance for the full parameter vector can be orthogonally decomposed into three terms, corresponding to the mean parameters, the generalised autoregressive parameters, and the innovation variances. The efficiency and effectiveness of the diagnostic in identifying influential subjects were shown by simulation studies and real data analysis.

Thirdly, we proposed a robust mean-covariance model for longitudinal data with an autoregressive moving average (ARMA) error process. The model was constructed using Mallows-type weights and a bounded score function applied to the Pearson residuals, which efficiently reduce the effect of leverage points and outliers. Under some regularity conditions, the resulting estimators of the regression coefficients in both the mean and the covariance were shown to be consistent and asymptotically normally distributed. Simulation studies and real data analyses, covering both non-contaminated and contaminated cases, were conducted to assess the performance of the robust method. The results showed that it performs well under different contaminations and distributions. In particular, the robust method outperforms its non-robust version when the data sets are contaminated.

In summary, we first developed smooth-threshold joint generalized estimating equations based on the MCD of the covariance matrix. They effectively solve the parameter estimation and variable selection problems for joint modelling of longitudinal data within the GEE framework, and they are easy to implement in practice. Secondly, generalized Cook diagnostic statistics were established for the joint mean-covariance model in the GEE framework. These diagnostic statistics can quickly and effectively identify outliers and influential points in real data sets. Finally, we proposed a robust estimation method for the mean and covariance jointly under a general decomposition of the covariance matrix in the GEE framework. This method overcomes the effects of outliers and influential points on statistical inference. This work should be helpful for future theoretical and practical studies of longitudinal data.
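
The modified Cholesky decomposition that underlies the first two parts of the work can be illustrated with a short sketch. It is not taken from the dissertation: the function name `modified_cholesky`, the pure-Python implementation, and the AR(1) example covariance are assumptions made for illustration. The sketch factorises a covariance matrix Sigma so that T Sigma T' = D, where T is unit lower triangular (its below-diagonal entries are the negatives of the generalised autoregressive parameters) and D is diagonal (its entries are the innovation variances).

```python
def modified_cholesky(sigma):
    """Return (T, D) such that T @ sigma @ T.T equals diag(D).

    T is unit lower triangular; D is a list of innovation variances.
    Illustrative helper, not the dissertation's implementation.
    """
    m = len(sigma)
    # Step 1: LDL^T factorisation, sigma = C diag(D) C^T, C unit lower triangular.
    C = [[0.0] * m for _ in range(m)]
    D = [0.0] * m
    for j in range(m):
        D[j] = sigma[j][j] - sum(C[j][k] ** 2 * D[k] for k in range(j))
        C[j][j] = 1.0
        for i in range(j + 1, m):
            s = sum(C[i][k] * C[j][k] * D[k] for k in range(j))
            C[i][j] = (sigma[i][j] - s) / D[j]
    # Step 2: T = C^{-1}, obtained column by column with forward substitution.
    T = [[0.0] * m for _ in range(m)]
    for col in range(m):
        for i in range(m):
            rhs = 1.0 if i == col else 0.0
            T[i][col] = rhs - sum(C[i][k] * T[k][col] for k in range(i))
    return T, D


# Hypothetical example: AR(1) covariance with unit variance and rho = 0.5.
rho, m = 0.5, 4
sigma = [[rho ** abs(i - j) for j in range(m)] for i in range(m)]
T, D = modified_cholesky(sigma)

# Generalised autoregressive parameters: phi[j][k] = -T[j][k] for k < j.
phi = [[-T[j][k] if k < j else 0.0 for k in range(m)] for j in range(m)]
```

For an AR(1) covariance, each measurement depends on its past only through the immediately preceding one, so the only nonzero entry in each row of `phi` is the lag-one coefficient rho, and the innovation variances are 1 for the first measurement and 1 - rho^2 afterwards. Because the entries of `phi` are unconstrained and the logarithms of the innovation variances are unconstrained, both can be given regression models, which is what makes this decomposition convenient for joint mean-covariance modelling.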
Keywords/Search Tags: Longitudinal data, Joint mean-covariance model, Generalized estimating equations, Model selection, Statistical diagnosis, Robust statistics