
Study On Design And Statistical Analysis Strategies For Large Sample Longitudinal Health Management Cohort Data

Posted on: 2014-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: Q Zhang
GTID: 2234330398959842
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Health management is a whole-process approach that comprehensively monitors, analyzes, estimates and forecasts individual and population risk factors in order to provide health consultation and health intervention. In its generalized sense it also covers disease management: health management proper addresses low-risk status, at-risk status and early pathological changes, while disease management addresses clinical syndromes, disease and its different outcomes. Although the concept of health management is still imperfect and its ideas, theories and methods are immature, public demand has made it an area that governments should focus on, support and develop. Meeting that demand requires in-depth research into the theories, methods and strategies of health management. Among these, large-sample longitudinal health management cohorts based on health examinations are an important platform. Building such cohorts makes it possible not only to collect individual information, manage health records, assess individual risk and evaluate health interventions, but also to clarify the occurrence, development and outcomes of disease. To this end, our research group has been building large-sample, multicenter longitudinal health management cohorts in Shandong Province since 2004. The author has been a core member of the group since 2007 and has taken part in data collection, data management, database construction and cohort follow-up.
This thesis therefore draws on that research experience. It presents a systematic study covering the construction of multicenter, large-sample longitudinal health management cohorts, data collection (data cleaning, integration and transformation), multiple imputation, and data analysis (generalized estimating equations (GEE), mixed-effects models (LME), Cox regression and joint models). The association between serum uric acid and metabolic syndrome is used as a worked example. The aim is to establish design and statistical analysis strategies for large-sample longitudinal health management cohort data.

Results:
1. A "large sample longitudinal health management cohort data management system" was built to transform raw data into analysis-ready data by completing the data dictionary and disease dictionary, managing value-assignment rules, importing raw data, and searching and exporting the processed data.
2. Missing values were handled by multiple imputation with accompanying diagnostics, using the MI procedure in SAS and the Amelia II package in R.
3. After adjustment for potential confounders, the association between serum uric acid level and metabolic syndrome was confirmed by the GEE, LME, Cox and joint models. Compared with people with a normal uric acid level, those with a high level had 1.449 (95% CI: 1.215, 1.727), 1.527 (95% CI: 1.187, 1.965), 1.496 (95% CI: 1.287, 1.740) and 1.3735 (95% CI: 1.1565, 1.6313) times the risk of metabolic syndrome under the four models respectively. All four models could be applied to these data.
4. Based on the real longitudinal health management cohort data and the joint model, data were simulated to compare type I error, power and estimation bias across the four models. When H0 was true, at the 0.05 significance level, the type I error rates were all close to 0.05: they fluctuated randomly around 0.05 for GEE and Cox and were slightly above 0.05 for the other two models. When H0 was false, power rose toward 100% as the sample size and the partial regression coefficient increased. The four methods showed similar patterns, but power was higher for GEE and LME than for Cox and the joint model. In terms of estimation bias, LME performed best, followed by GEE, Cox and the joint model.

Conclusions:
1. These strategies address the design and data analysis problems of longitudinal health management cohorts and extend health management from the initial information-collection phase to the deeper phases of risk assessment, disease forecasting, health improvement and intervention management.
2. The "large sample longitudinal health management cohort data management system" can integrate health management data from many centers into a unified data management platform. Its interface is friendly and easy to operate, which lays a foundation for subsequent imputation and regression.
3. Multiple imputation for missing values: the SAS MI procedure, based on the MCMC algorithm, is a classical and general method, whereas the Amelia package, based on the EMB (expectation-maximization with bootstrapping) algorithm, is easy to use through the AmeliaView interface for importing data, imputing, exporting and running diagnostics. Amelia can also impute many kinds of data, such as cross-sectional data, time-series data and their combination (treated here as longitudinal data).
4. The simulation indicated how to select regression methods for large-sample longitudinal health management cohort data: a) when the sample size is large enough, GEE can be used to estimate associations between variables; b) when the data are multicenter, use LME; c) in general, both of these models perform well; d) the more complex joint model did not give better results for these data, but it can compute an individual's disease risk and survival probability at a given time point and can therefore be used for risk assessment; e) large bias would occur if Cox regression were used on its own with these data.
5. The worked example confirmed that a high serum uric acid (UA) level can increase the risk of metabolic syndrome (MetS).

Innovations:
1. Proposed design and statistical analysis strategies for large-sample longitudinal health management cohort data.
2. The "large sample longitudinal health management cohort data management system" can integrate health management data from many centers into a unified data management platform.
3. Compared the GEE, LME, Cox and joint models by simulation and gave regression strategies.

Deficiencies: In the multiple imputation step, we did not cover additional imputation methods or diagnostic techniques and did not compare the MI procedure with Amelia by simulation. Nor did we analyze, from a mathematical standpoint, the key factors that influence the regression results.
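The abstract does not say how results from the multiply imputed datasets were combined. As an illustration only, the sketch below applies Rubin's rules, the standard way to pool one coefficient across m imputed datasets. The function name `pool_rubin` and the input numbers are hypothetical and are not taken from the thesis.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    estimates: the coefficient estimated in each imputed dataset
    variances: the squared standard error of that coefficient in each dataset
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)            # pooled point estimate
    within = mean(variances)           # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, sqrt(total)


if __name__ == "__main__":
    # Made-up log-ratio estimates from five imputed datasets (illustration only).
    est = [0.36, 0.38, 0.35, 0.39, 0.37]
    var = [0.008, 0.008, 0.009, 0.008, 0.009]
    pooled, se = pool_rubin(est, var)
    print(f"pooled estimate = {pooled:.3f}, pooled SE = {se:.3f}")
```

The between-imputation term is what separates this from simply averaging the standard errors: the more the estimates disagree across imputed datasets, the wider the pooled interval becomes.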
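The simulation logic described in the results (empirical type I error when H0 is true, empirical power when it is false) can be sketched with a much simpler stand-in. The code below does not fit GEE, LME, Cox or joint models; it uses a two-sample z-test with known unit variance purely to show how rejection rates are counted over repeated simulated datasets. The function name `rejection_rate`, the effect sizes, the sample sizes and the seed are all assumptions made for this illustration.

```python
import random
from math import sqrt
from statistics import NormalDist


def rejection_rate(effect, n_per_group, n_reps=2000, alpha=0.05, seed=1):
    """Share of simulated datasets in which H0 (no group difference) is rejected.

    With effect == 0 this estimates the type I error rate;
    with effect != 0 it estimates power.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    se = sqrt(2.0 / n_per_group)                   # known unit variance per group
    rejections = 0
    for _ in range(n_reps):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        exposed = [rng.gauss(effect, 1.0) for _ in range(n_per_group)]
        diff = sum(exposed) / n_per_group - sum(control) / n_per_group
        if abs(diff / se) > z_crit:
            rejections += 1
    return rejections / n_reps


if __name__ == "__main__":
    print("type I error (effect 0.0, n=50):", rejection_rate(0.0, 50))
    for n in (50, 200):
        print(f"power (effect 0.3, n={n}):", rejection_rate(0.3, n))
```

Under this stand-in the rejection rate should sit near the nominal 0.05 when the effect is zero and rise with both sample size and effect size, which is the qualitative pattern the abstract reports. A comparison of the four models themselves would replace the z-test with each model's own fitting and testing step.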
Keywords/Search Tags: Health Management, Multicenter Longitudinal Cohort, Statistical Strategies