Font Size: a A A

Analysis of multivariate data with random cluster size

Posted on:2012-10-19Degree:Ph.DType:Dissertation
University:The Florida State UniversityCandidate:Li, XiaoyunFull Text:PDF
GTID:1450390008991113Subject:Statistics
Abstract/Summary:
In this dissertation, we examine binary correlated data with present/absent component or missing data that are related to binary responses of interest. Depending on the data structure, correlated binary data can be referred as clustered data if sampling unit is a cluster of subjects, or it can be referred as longitudinal data when it involves repeated measurement of same subject over time. We propose our novel models in these two data structures and illustrate the model with real data applications.;In biomedical studies involving clustered binary responses, the cluster size can vary because some components of the cluster can be absent. When both the presence of a cluster component as well as the binary disease status of a present component are treated as responses of interest, we propose a novel two-stage random effects logistic regression framework. For the ease of interpretation of regression effects, both the marginal probability of presence/absence of a component as well as the conditional probability of disease status of a present component, preserve the approximate logistic regression forms. We present a maximum likelihood method of estimation implementable using standard statistical software. We compare our models and the physical interpretation of regression effects with competing methods from literature. We also present a simulation study to assess the robustness of our procedure to wrong specification of the random effects distribution and to compare finite sample performances of estimates with existing methods. The methodology is illustrated via analyzing a study of the periodontal health status in a diabetic Gullah population.;We extend this model in longitudinal studies with binary longitudinal response and informative missing data. In longitudinal studies, when treating each subject as a cluster, cluster size is the total number of observations for each subject. When data is informatively missing, cluster size of each subject can vary and is related to the binary response of interest and we are also interested in the missing mechanism. This is a modified situation of the cluster binary data with present components. We modify and adopt our proposed two-stage random effects logistic regression model so that both the marginal probability of binary response and missing indicator as well as the conditional probability of binary response and missing indicator preserve logistic regression forms. We present a Bayesian framework of this model and illustrate our proposed model on an AIDS data example.
Keywords/Search Tags:Data, Cluster, Present, Binary, Missing, Regression, Random, Model
Related items