An Imputation-Estimation Algorithm Using Time-Varying Auxiliary Covariates for a Longitudinal Model When Outcome is Missing by Design

Posted on:2013-06-30

Degree:Ph.D

Type:Dissertation

University:The George Washington University

Candidate:Temprosa, Marinella Gracia Montealegre

GTID:1450390008979677

Subject:Biostatistics

Abstract/Summary:

In long-term clinical trials, missing data are a concern, especially when the rate at which data are missing depends on the treatment group. Typically, some effort is spent identifying the reasons the data are missing so that appropriate assumptions and analytic approaches can be applied. When data are missing by design, certain measurements are discontinued after a subject meets an endpoint, possibly because of ethical or financial constraints. Subjects who reach the absorbing barrier may stop data collection on some variables but may still have subsequent time-varying covariates available from continued follow-up. In this dissertation, we developed an Imputation-Estimation (IE) algorithm under an auxiliary missing-at-random assumption to assess whether the additional information from the time-varying covariates can be used to improve estimation. The quality of the estimates of the parameters of interest is evaluated in terms of bias, variance, and coverage. We contrast this method with other missing data approaches such as multiple imputation and available case analysis.

We illustrate this method using data from the Diabetes Prevention Program (DPP). The DPP was a diabetes prevention study that showed reductions in diabetes risk of 58% with the intensive lifestyle intervention and 31% with metformin, each compared with placebo. According to the DPP protocol, the oral glucose tolerance test (OGTT) is discontinued after diabetes diagnosis. Because the metformin and lifestyle interventions significantly reduced diabetes incidence, the rates of missing IGR and CIR differ among the treatment groups. This differential discontinuation results in informative, monotone missingness in the assessments of 30-minute glucose and insulin values. These 30-minute values are used to calculate surrogate measures of insulin secretion such as the Insulin Glucose Ratio (IGR = (30-min insulin - fasting insulin)/(30-min glucose - fasting glucose)).
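
The IGR definition above can be written as a small function. This is only an illustrative sketch: the function name, argument names, units, and example values are assumptions made here and are not taken from the dissertation or from DPP data.

```python
def insulin_glucose_ratio(ins30, ins0, glu30, glu0):
    """IGR = (30-min insulin - fasting insulin) / (30-min glucose - fasting glucose)."""
    delta_glucose = glu30 - glu0
    if delta_glucose == 0:
        # The ratio is undefined when glucose does not change from fasting to 30 minutes.
        raise ValueError("30-min and fasting glucose are equal; IGR is undefined")
    return (ins30 - ins0) / delta_glucose


# Hypothetical values (insulin in uU/mL, glucose in mg/dL), for illustration only.
print(insulin_glucose_ratio(ins30=60.0, ins0=10.0, glu30=150.0, glu0=100.0))  # 1.0
```

Because both 30-minute values enter the ratio, IGR cannot be computed at any visit where the OGTT was discontinued, which is why the missingness in the 30-minute measurements carries over to IGR.
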
Fasting blood glucose is collected at all time points and is associated with 30-minute glucose. The imputation-estimation algorithm is applied to estimate the mean 30-minute blood glucose using auxiliary information from the fasting blood glucose. In this example, fasting glucose is also the source of the discontinuation, since diabetes diagnosis is based on the fasting and 2-hour glucose values during the OGTT. Because of the strong dependence between the fasting and 30-minute glucose measured at the same visit, the resulting estimates from the IE algorithm using the complete vector were similar to those from multiple imputation. Because the placebo group experienced higher rates of diabetes incidence, the difference between available case analysis and the regression-based imputations was greater in the placebo group than in the lifestyle group.
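
The contrast between available case analysis and regression-based imputation can be illustrated with a small simulation. The sketch below is not the IE algorithm from the dissertation and is not multiple imputation; it is a single deterministic regression imputation at one visit, shown only to make the source of bias concrete. All distributions, coefficients, and function names are assumptions chosen for illustration, and the rule "30-minute glucose is missing when fasting glucose is at least 126 mg/dL" is a deliberate simplification of a diagnosis that, as noted above, also depends on the 2-hour value.

```python
import random
import statistics


def simulate(n=5000, seed=0):
    """Simulate one visit: fasting glucose is always seen, 30-min glucose may be missing."""
    rng = random.Random(seed)
    fasting = [rng.gauss(110.0, 15.0) for _ in range(n)]            # assumed distribution
    glucose30 = [60.0 + f + rng.gauss(0.0, 15.0) for f in fasting]  # assumed linear relation
    observed = [f < 126.0 for f in fasting]                         # simplified stopping rule
    return fasting, glucose30, observed


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def compare_estimates(fasting, glucose30, observed):
    """Return the mean 30-min glucose under three treatments of the missing values."""
    x_obs = [f for f, seen in zip(fasting, observed) if seen]
    y_obs = [g for g, seen in zip(glucose30, observed) if seen]
    intercept, slope = fit_line(x_obs, y_obs)
    # Keep observed outcomes; replace missing ones with their regression prediction.
    completed = [
        g if seen else intercept + slope * f
        for f, g, seen in zip(fasting, glucose30, observed)
    ]
    return {
        "full_data": statistics.fmean(glucose30),       # unavailable in practice
        "available_case": statistics.fmean(y_obs),
        "regression_imputed": statistics.fmean(completed),
    }


fasting, glucose30, observed = simulate()
for name, value in compare_estimates(fasting, glucose30, observed).items():
    print(f"{name}: {value:.2f}")
```

In this setup the available case mean falls below the full-data mean because the subjects with the highest fasting glucose are exactly the ones whose 30-minute values are dropped, while the regression on fasting glucose recovers most of that difference. The gap grows with the proportion of subjects who stop, which is consistent with the larger difference reported for the placebo group.
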

Keywords/Search Tags:

Missing, Imputation, Data, Diabetes, Algorithm, Glucose, Using, Covariates

Related items

1	Missing covariates in causal inference matching: Statistical imputation using machine learning and evolutionary search algorithm
2	Comparison And Empirical Analysis Of Imputation Methods For Missing Data
3	Imputation For Missing Value Of Compositional Data Based On Biclustering Algorithm
4	Extension of the Regression Method for Imputation of Data with Monotone Missing Pattern using Multivariate Adaptive Regression Splines (MARS), with Applications to Systematic-Missing-At-Random (SMAR) Study Design
5	Imputation Methods Of Missing Values For Compositional Data
6	Simulation Of The Missing Data Imputation Methods For The Regression Model
7	Statistical Inference Of Heteroscedasticity Modal Based On Missing Skewness Data
8	Analysis of failure time data under risk set sampling and missing covariates
9	Missing Value Imputation Study For Typical High-throughput Omics Data
10	Statistical Inference Of Cox Model Under Case? Interval-Censored Failure Time Data With Missing Time-Dependent Covariates