Statistical Analysis And Modeling For Complex Data

Posted on: 2018-02-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J F Liu
Full Text: PDF
GTID: 1310330563452534
Subject: Statistics
Abstract/Summary:
The analysis and modeling of complex data has received wide attention in statistical research. The complex data considered in this thesis include longitudinal data, missing data, and measurement error data. The key to longitudinal data analysis is to develop statistical models that take into account the dependence among the measurements; generalized estimating equations are often used to analyze such data. Missing data arise very often in longitudinal studies for various subjective and objective reasons. When the missingness probability depends on the observed data, the usual estimating equations are biased and hence fail to provide consistent estimates. The inverse probability-weighted generalized estimating equations approach can produce consistent estimates of the parameters by adjusting the contribution from each subject according to its probability of being observed. It is also common in practice that some covariates are subject to error, because of their own nature or the mechanism of measurement, so statistical analysis of measurement error data is necessary as well. More specifically, this thesis consists of the following four parts.

First, for longitudinal data with monotone missing response variables in a linear model, we propose an estimation method for the regression coefficients based on the quadratic inference function and inverse probability-weighted generalized estimating equations. The new method can handle the correlation within subjects without involving direct estimation of the nuisance parameters in the working correlation matrix. Under some regularity conditions, we show that the quadratic inference function estimator is consistent and asymptotically normal. Finite sample performance is assessed through simulation studies and a real data example.

Second, for longitudinal partial linear models with a covariate that is measured with error, we propose a generalized empirical likelihood method to estimate the parametric and nonparametric components, based on correction for attenuation and quadratic inference functions. We define a generalized empirical likelihood-based statistic for the regression coefficients and a residual-adjusted empirical likelihood for the baseline function. The empirical log-likelihood ratios are proven to be asymptotically chi-squared, and the corresponding confidence regions are then constructed. Compared with methods based on normal approximations, the generalized empirical likelihood does not require consistent estimators of the asymptotic variance and bias. Furthermore, a simulation study is conducted to evaluate the efficiency of the proposed method.

Third, we consider longitudinal partial linear models in which the response variable is sometimes missing, with a missingness probability that depends on a covariate measured with error. We model the missingness probability based on the observed surrogates. This treatment of the missing data process enables us to build a more sensible model and allows a more transparent interpretation of the model parameters. We posit a logistic regression model for the missing data process. The proposed method, which takes the within-group correlation into account, is used to estimate the regression coefficients, and a residual-adjusted empirical likelihood is employed to estimate the baseline function so that undersmoothing is avoided. The empirical log-likelihood ratios are proven to be asymptotically chi-squared, and the corresponding confidence regions for the parameters of interest are then constructed. Compared with methods based on normal approximations, the generalized empirical likelihood does not require consistent estimators of the asymptotic variance and bias. A numerical study and a real data application show that the proposed method performs well.

Fourth, we focus on the problem of smooth-threshold variable selection for partially linear models with monotone missing longitudinal data. The proposed method is based on smooth-threshold inverse probability-weighted generalized estimating equations. The procedure automatically eliminates irrelevant covariates by setting the corresponding coefficients to zero, and simultaneously estimates the nonzero regression coefficients by solving the smooth-threshold inverse probability-weighted generalized estimating equations. The main merit of this new procedure is that it avoids the convex optimization problem and is easy to implement. Under some regularity conditions, the resulting estimator is consistent in variable selection and has the oracle property in estimation. Simulation studies are conducted to examine the finite sample performance of the proposed method.
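
The inverse probability weighting idea that runs through the abstract can be illustrated with a small simulation. The sketch below is not the thesis's estimator: it assumes a two-visit linear model with a subject random effect, dropout at visit 2 that depends on the observed visit-1 response through a logistic model, and an independence working correlation, so the weighted estimating equations reduce to weighted least squares. All names (`simulate`, `fit_logistic`, `weighted_ls`, `demo`) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n, rng):
    """Two visits per subject; visit 2 may drop out (monotone missingness)."""
    x = rng.normal(size=(n, 2))                 # covariate at each visit
    b = rng.normal(size=n)                      # random effect -> within-subject correlation
    e = rng.normal(scale=0.5, size=(n, 2))
    y = 1.0 + 0.5 * x + b[:, None] + e          # assumed true beta = (1.0, 0.5)
    # Assumed dropout model: P(visit 2 observed | y1) = expit(1 - y1).
    p2 = 1.0 / (1.0 + np.exp(-(1.0 - y[:, 0])))
    r2 = rng.random(n) < p2
    return x, y, r2


def fit_logistic(Z, r, n_iter=25):
    """Newton-Raphson fit of a logistic regression of r on the columns of Z."""
    gamma = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Z @ gamma))
        grad = Z.T @ (r - p)
        hess = (Z * (p * (1.0 - p))[:, None]).T @ Z
        gamma = gamma + np.linalg.solve(hess, grad)
    return gamma


def weighted_ls(X, y, w):
    """Solve sum_i w_i x_i (y_i - x_i' beta) = 0 for beta."""
    XtW = X.T * w
    return np.linalg.solve(XtW @ X, XtW @ y)


def demo(seed=0, n=20000):
    rng = np.random.default_rng(seed)
    x, y, r2 = simulate(n, rng)

    # Estimate the probability of being observed at visit 2 from y1.
    Z = np.column_stack([np.ones(n), y[:, 0]])
    gamma = fit_logistic(Z, r2.astype(float))
    pi_hat = 1.0 / (1.0 + np.exp(-Z @ gamma))

    # Stack both visits; unobserved visit-2 responses get weight zero.
    X = np.column_stack([np.ones(2 * n), np.concatenate([x[:, 0], x[:, 1]])])
    y_stacked = np.concatenate([y[:, 0], np.where(r2, y[:, 1], 0.0)])
    w_unweighted = np.concatenate([np.ones(n), r2.astype(float)])
    w_ipw = np.concatenate([np.ones(n), r2 / pi_hat])

    beta_unweighted = weighted_ls(X, y_stacked, w_unweighted)
    beta_ipw = weighted_ls(X, y_stacked, w_ipw)
    return beta_unweighted, beta_ipw


if __name__ == "__main__":
    beta_unweighted, beta_ipw = demo()
    print("unweighted (available data):", beta_unweighted)
    print("inverse probability weighted:", beta_ipw)
```

In this setup the unweighted fit tends to underestimate the intercept, because subjects who remain at visit 2 have systematically lower random effects, while the weighted fit should land close to the assumed true value. The quadratic inference function, empirical likelihood, and smooth-threshold components described in the abstract are not reproduced here.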
Keywords/Search Tags:Longitudinal data, Inverse probability-weighted generalized estimating equations, Quadratic inference functions, Generalized empirical likelihood, Variable selection