Research on the Optimal Number of Groups and Grouping Uncertainty in Group-Based Trajectory Modeling

Posted on: 2023-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: C X Zhang
GTID: 2544306614481734
Subject: Public health
Abstract/Summary:
Background: A developmental trajectory is a series of observed values of a time-dependent variable. As a way of modeling longitudinal data, a developmental trajectory can describe dynamically how an indicator changes over time, for example the trajectory formed by temperature records during hospitalization. Compared with traditional cross-sectional variables, developmental trajectories carry more temporal information, which is of great significance for revealing real-time changes in patients' conditions and prognosis. Group-based trajectory modeling (GBTM) is a method for analyzing developmental trajectories. Derived from the finite mixture model, it defines each trajectory as a polynomial function of time, so it can flexibly fit various trajectory shapes by adjusting the polynomial coefficients. Most importantly, it achieves longitudinal clustering of indicators by identifying individuals with similar developmental trajectories. Because of this property, the model has been used more and more widely in recent years. Selecting the optimal number of groups is a crucial step in the modeling. This procedure is usually called class enumeration. The number of groups chosen affects not only the model fit and each individual's group membership probabilities but also the subsequent analysis. However, no criterion serves as a gold standard for selecting the optimal number of groups. Most published studies report only BIC and AIC, although more criteria are available for this purpose. After the optimal number of groups is determined, the model assigns individuals to groups according to the similarity of their longitudinal trajectories. However, an individual's assignment to a group is not deterministic. Instead, the model produces a series of
posterior probabilities for each individual, one per subgroup. This gives rise to the problem of grouping uncertainty. How much this uncertainty affects subsequent analyses in applications of GBTM, whether it needs to be adjusted for, and under what conditions adjustment should be applied are not yet clear in the literature. Traumatic brain injury (TBI) is a serious traumatic condition with high morbidity and mortality. The management of these patients in the intensive care unit (ICU) has long been an active research area. Blood glucose concentrations in TBI patients often change dramatically after acute traumatic injury. Analyzing blood glucose developmental trajectories after ICU admission may help with risk stratification and prognosis assessment.

Objective: 1. To compare, by simulation, all available criteria for determining the optimal number of groups in GBTM under combined conditions, and to discuss the conditions under which each criterion is appropriate. 2. To study the grouping uncertainty problem in GBTM by simulation, to quantify the degree of uncertainty and its influence on effect size estimation under various conditions, and to introduce two approaches for adjusting for the uncertainty and compare their performance and applicable conditions. 3. To study the blood glucose trajectories of TBI patients within 72 hours after admission to the ICU in order to explore whether heterogeneous blood glucose trajectories exist, and to explore the baseline characteristics of the trajectories and their association with prognosis during hospitalization.

Method: 1. To address the selection of the optimal number of groups, 12 criteria were compared in this study: two BIC (Bayesian Information Criterion) variants, AIC (Akaike Information Criterion), AICc (corrected Akaike Information Criterion), CAIC (Consistent Akaike Information Criterion), ABIC (sample-size adjusted Bayesian
Information Criterion), entropy, modified entropy, HQ (Hannan-Quinn information criterion), NEC (Normalized Entropy Criterion), CLC (Classification Likelihood Criterion) and ICL-BIC (Integrated Completed Likelihood). The simulation conditions covered different sample sizes, follow-up times, numbers of groups, polynomial degrees, vertical distances between groups and allocation proportions, for a total of 432 combinations. The accuracy of each criterion was judged by comparing the number of groups it selected with the true number of groups. 2. To quantify the influence of grouping uncertainty on effect size estimation, 216 combinations of sample size, number of groups, polynomial degree, vertical distance and allocation proportion between groups were simulated, and the bias of the effect size estimated under maximum posterior probability assignment was studied in every case. Two approaches for handling the uncertainty were introduced: the inverse matrix method and the pseudo-class draws method. The performance and applicable conditions of each method were studied. 3. The results of the simulation studies were applied to a real-world example. Demographics, physiological indicators, disease conditions, in-hospital death outcomes and longitudinal blood glucose measurements of patients with TBI were extracted from the MIMIC-IV (Medical Information Mart for Intensive Care IV) database. GBTM was used to model the blood glucose trajectories.

Results: 1. Selection of the optimal number of groups: Among the 12 criteria, the rates of correctly recovering the true number of groups were BIC1 (60.00%), BIC2 (57.74%), AIC (62.46%), AICc (63.26%), CAIC (58.69%), ABIC (64.92%), HQ (61.37%), NEC (33.97%), CLC (36.85%), ICL-BIC (37.19%), entropy (24.56%) and modified entropy (23.46%). Specifically, with a small sample size, the criteria with the highest accuracy rate were AICc, ABIC and BIC1 when the number of groups was small; when the number of groups was large, the criteria with the highest accuracy rate
were HQ, AIC and ABIC. When the sample size was relatively large, however, ABIC, BIC and CAIC performed well with a small number of groups, while AICc, HQ and AIC were more accurate with a large number of groups. The pattern of accuracy across follow-up time points was similar to that across sample sizes. When both the distance between trajectory groups and the number of groups were small, ABIC, AICc and AIC performed better. As the number of groups increased, the accuracy of ABIC decreased rapidly, whereas HQ performed well; BIC and CAIC showed better results when the vertical distance between groups was large. When group sizes were similar, ABIC, BIC and CAIC performed better with a small number of groups, and AICc, AIC and HQ performed better with a large number of groups. When group sizes were unbalanced, ABIC, BIC, CAIC, HQ, AIC and AICc were the best criteria for both small and large numbers of groups.

2. Simulation results of grouping uncertainty: Estimating the effect size without considering grouping uncertainty introduced bias, but the magnitude and direction of the bias depended on various conditions. As the entropy, sample size and distance between groups increased and the groups became more balanced, the bias of the effect size decreased. Meanwhile, as the entropy and distance between groups increased and the sample size decreased, the relationship between the estimated value and the true value became more robust. In the two-group condition, when the entropy was low (<0.8), the bias of the effect size was relatively small for the unadjusted method. When the entropy was high (>0.8), however, the point estimate from the inverse matrix method was superior to that of the unadjusted method in some combinations. In terms of point estimation of the effect size, the OR estimated by the unadjusted method was closer to the true value in most cases, and this
was more pronounced as the sample size increased. In the three-group models, when the entropy was between 0.4 and 0.8, the estimates obtained by the adjusted methods showed less bias under some conditions. The direction of the bias, the proportion of best estimates and the coverage of the confidence interval also improved. In the four-group models, the performance of the adjusted methods improved significantly. In many simulation combinations, the bias of the point estimate and the proportion of best estimates from the inverse matrix method were better than those of the unadjusted method.

3. Blood glucose trajectory in patients with TBI: The blood glucose trajectories of patients with TBI in the ICU were modeled by GBTM. The criteria for selecting the optimal number of groups were considered together, and a model with three subgroups was finally chosen. Group 1 contained 515 patients (62.05%), whose blood glucose trajectory was stable at a relatively low concentration over time. Group 2 included 252 patients (30.36%), whose blood glucose trajectory was similar to that of group 1 but at a higher concentration. Group 3 included 63 patients (7.59%), who had the highest blood glucose levels at the beginning of the trajectory; their levels declined significantly over time and stabilized at a relatively high concentration about 12 hours later. In the multivariate analysis, group 2 showed a higher risk than group 1 both without and with adjustment for grouping uncertainty, with ORs of 2.11 (1.36-3.28) and 1.91 (1.18-3.20), respectively.

Conclusion: The simulation results showed that ABIC, BIC, AIC, AICc, CAIC and HQ were accurate in selecting the optimal number of groups, although the performance of each criterion differed across simulation combinations, while entropy, modified entropy, NEC, CLC and ICL-BIC performed relatively poorly. Based on the simulation results, one appropriate process for the
selection of the optimal number of groups is as follows. First, the six well-performing criteria are used together to form a general impression of the number of groups. Then, the choice is fine-tuned according to the specific sample size, number of groups and allocation proportion of each group. In addition, entropy and entropy-related criteria performed poorly in the simulation, suggesting that an entropy penalty may not be a good strategy and that results from these criteria should be treated with more caution. When individuals were assigned directly by maximum posterior probability, the estimated effect size was biased to some degree. The size and direction of the bias were affected by various conditions. As the entropy, sample size and distance between groups increased and the groups became more balanced, the bias of the effect size estimate decreased. Meanwhile, as the entropy and distance between groups increased and the sample size decreased, the relationship between the estimated effect size and the true value became more robust. Based on the simulation results, three scenarios should be distinguished when deciding whether grouping uncertainty needs to be adjusted for: (1) when the model has two groups, the groups obtained by the maximum posterior probability method may be the better choice if the entropy is less than 0.8, while if the entropy is greater than 0.8, both adjusted and unadjusted results can be consulted, with the unadjusted results as the primary reference; (2) when the model has three subgroups, the adjustment method should be applied when the entropy is greater than 0.4, and the larger the entropy, the greater the reference value of the adjusted results; (3) when the model has four groups, grouping uncertainty needs to be adjusted for, and especially when the entropy is high (>0.8), the conclusion should be based on the
adjusted results. Therefore, although entropy performed poorly in selecting the optimal number of groups, it is recommended to report the entropy routinely in practical applications of GBTM, because entropy is useful for the adjustment of grouping uncertainty. In the real-world example, the blood glucose trajectory was associated with prognosis in patients with TBI. Continuous blood glucose monitoring in patients with TBI may help to predict clinical outcomes. Patients with a persistently high blood glucose trajectory during treatment might also require more attention.
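
As a rough illustration of the kind of criteria compared in the abstract, the Python sketch below computes several of them from a fitted model's log-likelihood, and the relative entropy from the posterior probabilities. It is not the thesis's code. The formulas are the commonly used textbook forms in the "smaller is better" convention, which is an assumption here; the abstract does not state how BIC1 and BIC2 differ, so the sample size `n` is simply a parameter. All function names and the example numbers are hypothetical.

```python
import math


def information_criteria(log_lik, n_params, n):
    """Penalised-likelihood criteria (assumed convention: smaller is better).

    log_lik  : maximised log-likelihood of the fitted model
    n_params : number of free parameters
    n        : sample size used in the penalty (an assumption; the
               abstract does not say which n each criterion uses)
    """
    dev = -2.0 * log_lik
    k = n_params
    aic = dev + 2 * k
    return {
        "AIC": aic,
        "AICc": aic + 2 * k * (k + 1) / (n - k - 1),
        "BIC": dev + k * math.log(n),
        "CAIC": dev + k * (math.log(n) + 1),
        "ABIC": dev + k * math.log((n + 2) / 24),
        "HQ": dev + 2 * k * math.log(math.log(n)),
    }


def relative_entropy(posteriors):
    """Relative entropy in [0, 1]; 1 means every assignment is certain.

    posteriors : list of rows, one per individual, each row holding the
                 posterior probabilities over the K groups (summing to 1)
    """
    n = len(posteriors)
    k_groups = len(posteriors[0])
    # Terms with p == 0 contribute nothing, so they are skipped.
    total = sum(-p * math.log(p) for row in posteriors for p in row if p > 0)
    return 1.0 - total / (n * math.log(k_groups))


def select_groups(fits, n):
    """Pick, for each criterion, the number of groups with the lowest value.

    fits : {number_of_groups: (log_lik, n_params)}
    """
    table = {g: information_criteria(ll, k, n) for g, (ll, k) in fits.items()}
    names = list(next(iter(table.values())))
    return {name: min(table, key=lambda g: table[g][name]) for name in names}


if __name__ == "__main__":
    # Made-up log-likelihoods for 2-, 3- and 4-group models (not thesis data).
    hypothetical_fits = {2: (-1250.0, 7), 3: (-1200.0, 11), 4: (-1195.0, 15)}
    print(select_groups(hypothetical_fits, n=200))
    print(relative_entropy([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]))
```

With made-up fits like these, different criteria need not select the same number of groups, which is in line with the abstract's suggestion to judge several criteria together rather than rely on one.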
Keywords/Search Tags: group-based trajectory modeling, number of groups, posterior probability, traumatic brain injury