Computer Simulation Of Multiple Imputation For Analyzing Parallel Design And Crossover Design With Missing Data In Clinical Trial

Posted on: 2006-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: Q H Li
GTID: 2144360152996285
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Missing values are a potential source of bias in a clinical trial. Hence, every effort should be made to fulfill all the requirements of the protocol concerning data collection and management. In reality, there will almost always be some missing data. A trial may nonetheless be regarded as valid, provided the methods of dealing with missing values are sensible, particularly if these methods are defined in the protocol. Unfortunately, no universally applicable method of handling missing values can be recommended. The sensitivity of the results of the analysis to the method of handling missing values should therefore be investigated.

The most common designs in clinical trials are the parallel design and the crossover design. Under the influence of many factors, such as the duration of the trial, the nature of the disease, the efficacy of the trial drug and its toxicity, some patients drop out. Analyzing only the patients who completed the trial can lead to inaccurate conclusions. To make full use of the information from patients who dropped out, effective measures must be taken to handle the missing values they cause. At present, the most widely applied method for handling missing values in the parallel design is LOCF (Last Observation Carried Forward), but some statisticians have questioned its efficiency. For the crossover design, no universally applicable method of handling missing values can be recommended. Generally, if a patient has missing values in one period, all of that patient's information is deleted, which wastes a great deal of resources.

In the 1970s, Donald B. Rubin proposed multiple imputation for handling missing data. Multiple imputation replaces each missing value with a set of plausible values that represent the uncertainty about the right value to impute. The multiply imputed data sets are then analyzed by using standard procedures for complete data, and the results from these analyses are combined.
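The combining step is usually carried out with what are known as Rubin's rules. The abstract does not spell them out, and the thesis itself used SAS, so the following Python is only a minimal sketch under that assumption. The function name `pool_rubin` and its arguments are hypothetical, chosen for illustration.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data results with Rubin's rules (illustrative sketch).

    estimates: one point estimate per imputed data set.
    variances: the squared standard error of each estimate.
    Returns (pooled estimate, within variance, between variance,
             total variance, degrees of freedom).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)               # pooled point estimate
    u_bar = mean(variances)               # within-imputation variance
    b = variance(estimates)               # between-imputation variance (divisor m - 1)
    t = u_bar + (1.0 + 1.0 / m) * b       # total variance
    # Rubin's (1987) degrees of freedom; infinite when the imputations agree exactly.
    if b == 0:
        df = math.inf
    else:
        df = (m - 1) * (1.0 + u_bar / ((1.0 + 1.0 / m) * b)) ** 2
    return q_bar, u_bar, b, t, df
```

The between-imputation term is what carries the extra uncertainty due to the missing values: when the imputed data sets disagree, the total variance grows beyond the average completed-data variance.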
This process results in valid statistical inferences that properly reflect the uncertainty due to missing values. With the development of computational methods and of the corresponding statistical software, multiple imputation has come to be widely used in biomedicine, behavioral science and social science.

Following the statistical inferential principles of multiple imputation, this project establishes linear models and uses computer simulation to test whether multiple imputation is a valid method for handling missing values in the parallel design and the crossover design. A series of SAS procedures was written for sampling and simulation. The main work of the study is as follows:

1. For the crossover design, Monte Carlo simulation was used to generate complete data sets with normal distributions, and incomplete data sets were produced by deleting several values at random. The power of the test and the accuracy rate were simulated for the complete data set, the data set with missing values and the multiply imputed data sets, under both fixed and varying linear model parameters.
2. For the parallel design, Monte Carlo simulation was used to generate complete data sets with normal distributions, and incomplete data sets were produced by deleting several values at random. The power of the test and the accuracy rate were simulated for the complete data set, the data set with missing values, the data set filled in with LOCF and the data sets filled in by multiple imputation, under both fixed and varying linear model parameters.

Large-scale computer simulation led to the following conclusions. For the 2×2 crossover design, when the true difference between the two populations is zero, the accuracy rates of the complete data set and of the data set with missing values are very high. The accuracy rate of the multiply imputed data sets is lower than that of the complete data set and of the data set with missing values.
When a difference exists between the two populations, the power of the test is highest for the complete data set and lowest for the data set with missing values, and it decreases as the number of missing values increases. After the missing data are imputed, the power of the test increases, and as the number of imputations increases, the power approaches that of the complete data set. For the parallel design, the computer simulation shows that using LOCF to handle a data set with missing values can increase the type II error.

In clinical trials, we often need to answer the question of whether a new...
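The kind of Monte Carlo comparison described for the parallel design can be sketched as follows. This is not the thesis's SAS code or its linear model. It is a simplified illustration that rests on several assumptions: a response that drifts linearly over visits, monotone dropout with a fixed per-visit probability, and a large-sample z-test in place of an exact test. All function names and parameter values are hypothetical, and the multiple imputation arm is left out for brevity.

```python
import math
import random
from statistics import mean, variance


def locf(visits):
    """Fill each missing visit (None) with the last observed value."""
    filled, last = [], None
    for value in visits:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def simulate_patient(rng, slope, n_visits, sd, dropout_prob):
    """Return (complete, observed) visit trajectories for one patient."""
    complete = [slope * v + rng.gauss(0.0, sd) for v in range(1, n_visits + 1)]
    observed = list(complete)
    # Assumed monotone dropout: the first visit is always observed; before each
    # later visit the patient may leave the trial and never return.
    for v in range(1, n_visits):
        if rng.random() < dropout_prob:
            observed[v:] = [None] * (n_visits - v)
            break
    return complete, observed


def rejects(x, y, crit=1.96):
    """Two-sided large-sample z-test on the difference in means (approximation)."""
    if len(x) < 2 or len(y) < 2:
        return False
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return se > 0 and abs(mean(x) - mean(y)) / se > crit


def rejection_rates(effect, n_sims=500, n_per_arm=50, n_visits=4,
                    sd=1.0, dropout_prob=0.1, seed=1):
    """Estimate how often each strategy rejects 'no difference' at the final visit."""
    rng = random.Random(seed)
    counts = {"complete": 0, "completers_only": 0, "locf": 0}
    for _sim in range(n_sims):
        endpoints = {name: ([], []) for name in counts}
        # Arm 0 is control (no drift); arm 1 reaches `effect` at the final visit.
        for arm, slope in enumerate((0.0, effect / n_visits)):
            for _patient in range(n_per_arm):
                complete, observed = simulate_patient(
                    rng, slope, n_visits, sd, dropout_prob)
                endpoints["complete"][arm].append(complete[-1])
                endpoints["locf"][arm].append(locf(observed)[-1])
                if observed[-1] is not None:
                    endpoints["completers_only"][arm].append(observed[-1])
        for name, (control, treated) in endpoints.items():
            counts[name] += rejects(control, treated)
    return {name: count / n_sims for name, count in counts.items()}
```

Calling `rejection_rates(effect=0.0)` estimates the type I error of each strategy, while a nonzero `effect` estimates power. No output is shown because the numbers depend entirely on the assumed parameters.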
Keywords/Search Tags: missing data, multiple imputation, Markov chains, Monte Carlo, crossover design, power of test, sample size