| Background: ARDS has a high incidence and mortality rate in the ICU, with mortality correlated with the severity of oxygenation impairment as reflected by the oxygenation index. Timely and accurate prediction of which ARDS patients will become critically ill is crucial so that they receive appropriate treatment and care early on, which may improve their prognosis. Objective: 1. To explore the application of machine learning techniques to establish artificial intelligence algorithm models for early (4, 8, or 12 hours in advance) recognition of deteriorating ARDS, and to validate the performance of the models. 2. To apply machine learning techniques to analyze the risk factors related to the deterioration of ARDS during hospitalization. Methods: (1) Building on the AI diagnostic model established in previous work and using clinical data of 1278 ARDS patients from the MIMIC-III database, clinical indicators were extracted using PostgreSQL 14 according to previous studies on prognostic factors of ARDS and the clinical experience of physicians. (2) According to the prediction time interval, the data were divided into three groups: a 4-hour-ahead group, an 8-hour-ahead group, and a 12-hour-ahead group. Deterioration was defined as progression of the oxygenation index during hospitalization from 150 ≤ P/F ≤ 300 mmHg to less than 150 mmHg; on this criterion, the time points of deterioration were screened and the required time window was determined. General information, laboratory indicators, ventilator parameters, clinical scores, and continuous physiological indicators within the required time window were extracted using STATA 17. Trend variables for the continuous physiological indicators were also calculated. The resulting data were then analyzed using univariate logistic regression in RStudio to compare the deterioration group with the control group. (3) Lasso regression with cross-validation was used to select variables with significant predictive value for ARDS deterioration and
include them in the model. (4) All data in each group were randomly divided into a training set (80%) and a testing set (20%). The XGBoost, random forest, and support vector machine algorithms were used to construct the models, and the best combination of hyperparameters for each was selected on the training set using 10-fold cross-validation or exhaustive search. (5) The training data were used for parameter optimization and model training, while the testing data were used to evaluate predictive performance using precision, sensitivity, accuracy, specificity, and the area under the ROC curve (AUC). (6) SHAP analysis was used to rank variable importance according to the mean SHAP value of each variable in the XGBoost model. Results: 1. The 4-hour-ahead group included 1189 deterioration cases and 904 control cases; the 8-hour-ahead group included 1164 deterioration cases and 893 control cases; and the 12-hour-ahead group included 1136 deterioration cases and 888 control cases. A total of 42 indicators were included in the study. Univariate analysis showed no statistically significant differences (P>0.05) between the deterioration and control groups in gender or in comorbidities such as chronic kidney disease and coronary heart disease. However, the SOFA score, the SAPS II score, the latest values of ventilator plateau pressure, creatinine, lactate, and pH, and vital signs including body temperature, pulse oximetry saturation, respiratory rate, mean arterial pressure, and heart rate (both the latest value and the mean over the past 6 hours) differed significantly between the groups (P<0.05). 2. Through Lasso regression, 13 variables (4-hour group), 14 variables (8-hour group), and 14 variables (12-hour group) were selected from the 42 indicators to construct the models. 3. The predictive performance of the
algorithm models on the test set showed that, compared with the random forest and support vector machine models, the XGBoost model performed best, with AUCs of 0.888 (4-hour-ahead group; 95% CI: 0.859-0.917), 0.821 (8-hour-ahead group; 95% CI: 0.783-0.860), and 0.801 (12-hour-ahead group; 95% CI: 0.7597-0.8425). 4. The variable importance rankings were as follows: (1) For the 4-hour-ahead group: latest lactate value (Lac), mean heart rate within 6 hours (HR_m), latest value of ventilator plateau pressure (Pplat), mean pulse oxygen saturation within 6 hours (SpO2_m), mean value of the mean arterial pressure within 6 hours (MAP_m), etc. (2) For the 8-hour-ahead group: latest lactate value (Lac), mean value of the mean arterial pressure within 6 hours (MAP_m), standard deviation of heart rate within 6 hours (HR_s), latest value of pulse oxygen saturation (SpO2_v), mean heart rate within 6 hours (HR_m), etc. (3) For the 12-hour-ahead group: latest lactate value (Lac), latest value of ventilator plateau pressure (Pplat), mean value of the mean arterial pressure within 6 hours (MAP_m), standard deviation of heart rate within 6 hours (HR_s), latest value of pulse oxygen saturation (SpO2_m), etc. Conclusions: 1. It is feasible to establish an ARDS deterioration prediction model using machine learning techniques. The model built with the XGBoost algorithm can effectively predict deterioration in ARDS patients. 2. When the condition of ARDS patients is predicted continuously, the main risk factors related to deterioration are the latest lactate value, the latest plateau pressure value, the mean arterial pressure within 6 hours, the mean pulse oxygen saturation within 6 hours, and the heart rate variability within 6 hours. |
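
The deterioration screening and the window-based trend variables described in the Methods can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the study's own STATA/R code: the function names, the `(hour, value)` tuple layout, the inclusive 6-hour look-back window, and the use of the population standard deviation are all hypothetical choices made here for clarity.

```python
from statistics import mean, pstdev


def find_deterioration_time(pf_series):
    """Return the first time at which P/F falls below 150 mmHg after at
    least one earlier reading in the 150-300 mmHg range, or None.

    pf_series: list of (hour, pf_ratio) tuples sorted by time (assumed layout).
    """
    seen_mild_or_moderate = False
    for hour, pf in pf_series:
        if 150 <= pf <= 300:
            seen_mild_or_moderate = True
        elif pf < 150 and seen_mild_or_moderate:
            return hour
    return None


def window_features(samples, onset_hour, lead_hours, window_hours=6.0):
    """Summarise one physiological signal over the look-back window that
    ends `lead_hours` before the deterioration time.

    samples: list of (hour, value) tuples sorted by time (assumed layout).
    Returns a dict with the latest value, mean, and standard deviation,
    or None if no sample falls inside the window.
    """
    window_end = onset_hour - lead_hours
    window_start = window_end - window_hours
    values = [v for t, v in samples if window_start <= t <= window_end]
    if not values:
        return None
    return {
        "latest": values[-1],   # e.g. an "_v" style variable
        "mean": mean(values),   # e.g. an "_m" style variable
        "sd": pstdev(values),   # e.g. an "_s" style variable (population SD assumed)
    }


# Hypothetical patient: P/F readings and hourly heart rate.
pf_readings = [(0, 280), (6, 210), (12, 170), (20, 140), (24, 120)]
heart_rate = [(h, 80 + h) for h in range(0, 21)]

onset = find_deterioration_time(pf_readings)
features_by_lead = {
    lead: window_features(heart_rate, onset, lead) for lead in (4, 8, 12)
}
```

Computing the same summary for each of the three lead times yields one feature row per prediction group, which mirrors the separate 4-, 8-, and 12-hour-ahead datasets described above.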