Background: Lung cancer remains the leading cause of cancer-related death worldwide. A proportion of pulmonary nodules represent potentially curable early-stage cancer, yet no specific biomarker exists for the diagnosis of pulmonary nodules. Systemic inflammation plays an important role in the carcinogenesis, progression and prognosis of lung cancer, but research on the value of inflammatory markers in distinguishing malignant from benign pulmonary nodules is limited.

Objective: To explore predictors of malignancy in pulmonary nodules and to establish a prediction model based on routine blood tests, CT and clinical features, using different variable selection methods.

Methods: Data of patients with pulmonary nodules who had pathological findings at Zhujiang Hospital of Southern Medical University from January 1, 2016 to September 30, 2022 were retrospectively collected and divided into a training cohort and a validation cohort according to the time of visit. Based on the clinical and imaging characteristics of the patients, predictors were selected by least absolute shrinkage and selection operator (LASSO) regression to establish a prediction model for malignant pulmonary nodules, Model1. After routine blood test data were added, the potential predictors for Model2 and Model3 were selected by LASSO regression and by traditional methods (univariate and multivariate analyses), respectively. Model performance was evaluated with the receiver operating characteristic (ROC) curve, area under the ROC curve (AUC), Brier score, calibration curve, Hosmer-Lemeshow goodness-of-fit test (H-L test), decision curve analysis (DCA) and integrated discrimination improvement (IDI). External validation was performed in the validation cohort. The optimal prediction model was selected and a nomogram was drawn to visualize it.

Results: A total of 459 patients with pulmonary nodules were enrolled in the training set, 267 of whom had malignant pulmonary nodules. A total of 121 patients with pulmonary nodules were included in the validation set, 79 of whom had malignant pulmonary nodules. The mean AUCs of Model1, Model2 and Model3 in the training cohort were 0.76 (95% CI: 0.72-0.81), 0.79 (95% CI: 0.74-0.83) and 0.79 (95% CI: 0.75-0.83), respectively. The Brier scores of all models were less than 0.25 and the P values of the H-L test were greater than 0.05. Calibration curves demonstrated acceptable model calibration, with good agreement between the observed frequency and the predicted probability of malignant pulmonary nodules in both datasets, and all three models showed clinical net benefit at thresholds of 15%-80%. In external validation, the AUCs of Model1, Model2 and Model3 were 0.77 (95% CI: 0.68-0.86), 0.74 (95% CI: 0.65-0.83) and 0.74 (95% CI: 0.64-0.83), respectively. The Brier scores were smaller than 0.25 and the calibration curves suggested good model fit. All models showed clinical net benefit at thresholds of 10%-85%. Compared with Model1, the predictive accuracy of Model2 was improved by 3.4% (IDI=0.034, 95% CI: 0.018-0.050, P<0.001), and the predictive accuracy of Model3 was improved by 3.9% (IDI=0.039, 95% CI: 0.021-0.056, P<0.001). Model2 was selected as the optimal model. Nine variables were independent predictors and were included in the prediction model: age (OR=1.045, 95% CI: 1.025-1.067), male sex (OR=0.577, 95% CI: 0.366-0.903), diameter (OR=1.064, 95% CI: 1.024-1.107), density (mixed ground-glass nodule, OR=2.676, 95% CI: 1.59-4.579; pure ground-glass nodule, OR=3.799, 95% CI: 2.019-7.336), spiculation (OR=1.694, 95% CI: 1.070-2.687), lobulation (OR=2.116, 95% CI: 1.253-3.603), satellite lesions (OR=0.31, 95% CI: 0.120-0.753), mean corpuscular volume (MCV) (OR=0.016, 95% CI: 1.007-1.071) and platelet count (PLT) (OR=1.007, 95% CI: 1.003-1.011). A nomogram was drawn to visualize the model.

Conclusions: LASSO regression can screen variables more effectively and accurately. Age, male sex, diameter, density, spiculation, lobulation, satellite lesions, MCV and PLT were independent predictors of malignancy in patients with pulmonary nodules. Our prediction model for malignant pulmonary nodules performs well.
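
As an illustration of two of the evaluation metrics named in the Methods, the following minimal sketch computes a Brier score and the net benefit used in decision curve analysis. The function names and the toy data are hypothetical and are not taken from the study. It also shows why 0.25 is the usual reference value for the Brier score: a model that always predicts a probability of 0.5 scores exactly 0.25.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0/1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening on every patient with predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


# Hypothetical toy data (1 = malignant, 0 = benign); not study data.
y = [1, 1, 1, 0, 0, 1, 0, 1]
p = [0.9, 0.8, 0.6, 0.3, 0.4, 0.7, 0.2, 0.55]

print("Brier score:", brier_score(y, p))
print("Brier score of a constant 0.5 prediction:", brier_score(y, [0.5] * len(y)))
print("Net benefit at a 50% threshold:", net_benefit(y, p, 0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing it with the "treat all" and "treat none" strategies.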
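
The integrated discrimination improvement reported above can be sketched as follows, under the standard definition of IDI as the change in discrimination slope (mean predicted probability in events minus mean predicted probability in non-events) between two models. The function name and toy data are hypothetical, and this sketch does not reproduce the confidence intervals or P values reported in the study.

```python
def idi(y_true, p_old, p_new):
    """Integrated discrimination improvement of a new model over an old one."""
    def mean(values):
        return sum(values) / len(values)

    def discrimination_slope(probs):
        events = [p for y, p in zip(y_true, probs) if y == 1]
        nonevents = [p for y, p in zip(y_true, probs) if y == 0]
        return mean(events) - mean(nonevents)

    return discrimination_slope(p_new) - discrimination_slope(p_old)


# Hypothetical toy data (1 = malignant, 0 = benign); not study data.
y = [1, 1, 0, 0]
p_model_old = [0.6, 0.7, 0.4, 0.5]
p_model_new = [0.7, 0.8, 0.3, 0.4]

print("IDI:", idi(y, p_model_old, p_model_new))
```

A positive IDI means the new model separates malignant from benign cases more widely, on average, than the old one.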
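
To help read the odds ratios listed in the Results, the sketch below shows how an odds ratio from a logistic model rescales across several units of a predictor and how it shifts a baseline probability. The baseline probability of 0.5 is purely hypothetical, the helper names are illustrative, and the assumption that age is measured in years is not stated in the abstract. The model intercept is not reported, so this does not reproduce the nomogram.

```python
def scale_odds_ratio(odds_ratio, units):
    """Odds ratio for a change of `units` in a predictor, given the per-unit OR."""
    return odds_ratio ** units


def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability after multiplying the baseline odds by `odds_ratio`."""
    odds = baseline_prob / (1 - baseline_prob) * odds_ratio
    return odds / (1 + odds)


# Per-unit OR for age from the Results; a 10-unit increase (assumed: 10 years).
or_age_10 = scale_odds_ratio(1.045, 10)
print("OR for a 10-unit increase in age:", round(or_age_10, 3))

# OR for lobulation from the Results, applied to a hypothetical baseline of 0.5.
print("Probability with lobulation:", round(apply_odds_ratio(0.5, 2.116), 3))
```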