| Objective: To identify predictors of gestational diabetes mellitus (GDM) in early pregnancy from general demographic data, dietary intake, and routine biochemical indices, and to construct a simple, easy-to-operate, low-cost, and minimally invasive GDM prediction model that provides theoretical guidance for early pregnancy screening, identification of GDM high-risk groups, and implementation of intervention measures. The study also explored the best modeling method for the GDM prediction model, with the aim of reducing the prevalence of GDM and the incidence of adverse pregnancy outcomes. Methods: Pregnant women registered at the obstetric outpatient department of the Maternal and Child Health Hospital of Longgang District, Shenzhen City, Guangdong Province were included as study participants. General demographic data, dietary intake, and blood biochemical indices were collected at 6~13+6 weeks of pregnancy, and a 75 g oral glucose tolerance test (OGTT) was performed at 24~28 weeks of pregnancy, with GDM diagnosed according to the criteria in the Guidelines for Gestational Diabetes Mellitus (2014). General demographic data were collected from the Electronic Health Records Information System and Maternal and Child Health Care Services. Dietary intake was collected with a dietary app and the semi-quantitative food frequency method, and blood biochemical index data were collected from the hospital inspection system. Logistic regression analysis was performed to screen potential predictive factors and build a predictive model. The area under the receiver operating characteristic (ROC) curve (AUC), clinical decision curve analysis (DCA), and other indicators were used to evaluate the performance of the model. Restricted cubic spline (RCS) regression analysis was performed to determine the risk predictive value of the model. The performance of models built with five machine learning algorithms was compared to explore the best modeling method for the GDM prediction model. Results: 1. GDM occurrence and GDM 
predictors in early pregnancy. The prevalence of GDM among the study participants was 22.92%. Multivariate logistic regression analysis showed that early pregnancy systolic blood pressure >112 mm Hg (OR[95%CI]: 2.40[1.06~5.65]), early pregnancy weight >53.5 kg (OR[95%CI]: 2.30[1.06~5.16]), a fasting blood glucose level of 4.85 mmol/L (OR[95%CI]: 2.68[1.29~5.65]), an aspartate aminotransferase level >13.00 U/L (OR[95%CI]: 2.60[1.26~5.48]), and total meal frequency >19 times/week (OR[95]) were risk factors for GDM. In addition, maternal height >156 cm (OR[95%CI]: 0.248[0.11~0.56]), platelet distribution width in early pregnancy >9.70 fL (OR[95%CI]: 0.33[0.15~0.70]), thyrotropin level >24.9 U/mL (OR[95%CI]: 0.46[0.22~0.95]), and yogurt consumption >4 times/week (OR[95%CI]: 0.46[0.21~0.99]) were protective factors against GDM (P<0.05). 2. Performance comparison of GDM prediction models. The AUC (95%CI) was 0.66 (0.64~0.68) for Model 1, 0.66 (0.64~0.68) for Model 2, 0.76 (0.70~0.83) for Model 3, and 0.85 (0.80~0.91) for Model 4; the predicted intercept values for the high-risk population were >0.12 for Model 4, >0.17 for Model 3, >0.19 for Model 2, and >0.21 for Model 1. Calibration curve analysis showed that the predicted curves of Model 1 and Model 2 coincided closely with the actual and ideal curves, whereas the predicted curves of Model 3 and Model 4 deviated slightly from them. DCA showed that the net benefits of the four GDM prediction models ranked as follows: Model 4 > Model 3 > Model 2 > Model 1. 3. Performance comparison of GDM prediction models constructed by five machine learning algorithms. Among the GDM prediction models constructed by the five algorithms, the best AUC, 0.99, was achieved by both the random forest method and the gradient boosting tree method, with validation set AUCs of 0.76 (0.59~0.93) and 0.79 (0.64~0.94), respectively. The AUC of the logistic regression analysis training set was 0.85 (0.79~0.91) and that of the validation set was 
0.86 (0.75~0.96). Furthermore, the AUC of the neural network method was 0.73 (0.67~0.79) and that of its validation set was 0.75 (0.63~0.86). The logistic regression analysis and support vector machine prediction models outperformed the random forest and gradient boosting tree prediction models in terms of net benefit. Conclusion: General demographic data, dietary intake, and blood biochemical indices in early pregnancy were independent predictors of GDM and were closely related to its occurrence and development. The AUC of the jointly constructed GDM prediction model was the highest, and the random forest and gradient boosting tree methods were over-fitted. Moreover, the logistic regression analysis prediction model was more stable and predicted GDM occurrence more accurately than the random forest and gradient boosting tree prediction models. These results reveal the potential value of biochemical indicators and dietary factors for predicting GDM in the first trimester, offer new ideas and methods for building GDM prediction models with higher predictive ability and more stable performance, and provide guidance for identifying high-risk groups and planning intervention measures for GDM in the first trimester. |
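
The abstract compares models by AUC and by the net benefit from decision curve analysis. As a reading aid only, the sketch below shows how these two quantities are commonly computed from predicted risks. It is not the study's code: the function names `auc` and `net_benefit` are hypothetical, the eight labels and scores are made-up toy values, and the threshold 0.12 is borrowed from the value reported for Model 4 purely for illustration.

```python
def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(labels, scores, threshold):
    """Decision-curve net benefit at one threshold probability pt:
    TP/N - (FP/N) * pt / (1 - pt), flagging score >= pt as high risk."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Toy data (hypothetical): 1 = GDM at OGTT, 0 = no GDM; scores = predicted risk.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40, 0.08, 0.60]

toy_auc = auc(labels, scores)
toy_nb = net_benefit(labels, scores, threshold=0.12)
print(f"AUC = {toy_auc:.4f}, net benefit at 0.12 = {toy_nb:.4f}")
```

Repeating `net_benefit` over a grid of thresholds and plotting the result for each model gives the decision curves on which a ranking such as Model 4 > Model 3 > Model 2 > Model 1 is read.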