Objective: The sociodemographic characteristics, imaging data, tumor markers, and AI risk score of benign and malignant pulmonary nodules (hereinafter referred to as the AI risk score) of patients with pulmonary nodules (PN) were retrospectively analyzed to identify the independent risk factors for malignant pulmonary nodules. Based on these risk factors, a risk prediction model for benign and malignant pulmonary nodules was developed, evaluated, and validated.

Methods: This retrospective study included 610 patients who underwent biopsy or surgery for pulmonary nodules at the First Affiliated Hospital of Kunming Medical University between May 2021 and August 2022 and had a final pathological diagnosis. The data collected comprised demographic characteristics (gender, age, family history of malignant tumor, smoking history), imaging data (maximum nodule diameter, nodule location, and nodule density: partial solid nodule, pure ground-glass nodule, or solid nodule), special imaging findings (vacuole sign, lobulation sign, vascular convergence sign, and spiculation sign), the AI risk score, and serum tumor marker results. The subjects were randomly divided into a training set and a validation set at a ratio of 6.5:3.5. Between the training set (397 subjects) and the validation set (213 subjects), there were no statistically significant differences in gender or in the proportion of benign and malignant nodules. Univariate analysis was performed on the training set: qualitative data were described as n (%), and the χ² test and Fisher's exact test were used for comparisons between groups. The DeLong test was used to compare the areas under different ROC curves. Continuous variables, including age, CEA, CYFRA21, NSE, ProGRP, SCC, CA125, CA153, CA199, maximum nodule diameter, and AI risk score, were analyzed by optimal scaling regression. In the training set, variables with statistical significance and
professionally judged clinical significance after univariate screening were entered into the multivariate analysis, and a binary logistic regression model was fitted. The risk prediction model for benign and malignant pulmonary nodules was constructed by the stepwise method, and a nomogram was drawn to visualize the logistic regression results. The validation set was then used to validate the model: the ROC curve was used to evaluate discrimination, the calibration curve to evaluate calibration, and the decision curve to evaluate clinical applicability. Subgroup analyses were performed by gender (male or female), age (<45 years or ≥45 years), nodule location (upper lobe or non-upper lobe), and type of benign nodule (adenocarcinoma in situ or other benign). The validation set data were also entered into the Mayo model and the VA model, and the results were compared with those of the model developed in this study. For out-of-model testing, data were collected from patients who underwent surgery for pulmonary nodules at the First Affiliated Hospital of Kunming Medical University between September 2022 and November 2022 and had a final pathological diagnosis, comprising 36 benign nodules and 84 malignant nodules.

Results:
1. In univariate analysis, CEA, NSE, ProGRP, CA125, CA199, AI risk score, maximum nodule diameter, lobulation sign, spiculation sign, vascular convergence sign, vacuole sign, and nodule density differed significantly between groups (P<0.05).
2. Binary logistic regression analysis showed that CEA, ProGRP, CA125, AI risk score, spiculation sign, lobulation sign, partial solid nodule, and pure ground-glass nodule were independent risk factors for malignant nodules, and that solid nodule was a protective factor against malignancy.
3. The regression equation of the clinical prediction model is: Logit(P) = 0.9×CEA + 1.00×ProGRP + 1.12×CA125 + 2.15×AI risk score + 1.44×spiculation + 1.37×lobulation + 1.85×pGGN + 2.46×PSN. (Spiculation sign and lobulation
sign: negative is assigned 0 and positive is assigned 1; nodule density: solid nodule (SN) is assigned 0, pure ground-glass nodule (pGGN) is assigned 1, and partial solid nodule (PSN) is assigned 2; CEA ≤2.50 is assigned 1 and CEA >2.50 is assigned 2; ProGRP ≤48.00 is assigned 1 and ProGRP >48.00 is assigned 2; CA125 ≤14.50 is assigned 1 and CA125 >14.50 is assigned 2; AI risk score (%) ≤70 is assigned 1 and AI risk score (%) >70 is assigned 2.)
4. Discrimination of the model in the training set: the area under the ROC curve (AUC) of the clinical prediction model was 0.896 (95% CI: 0.861-0.930), suggesting good discrimination, with a sensitivity of 86.01%, a specificity of 79.28%, and an accuracy of 84.13%.
5. Internal validation of the model: the AUC of the clinical prediction model was 0.856 (95% CI: 0.799-0.912), indicating good discrimination. The calibration curve indicated that the constructed model was close to the ideal model. The Hosmer-Lemeshow (H-L) test (H-L=8.61, P=0.372) indicated good goodness of fit. Sensitivity was 89.73%, specificity 68.66%, and accuracy 83.10%.
6. In the internal validation cohort, the AUC was 0.670 (95% CI: 0.596-0.745) for tumor markers, 0.731 (95% CI: 0.664-0.798) for imaging data, and 0.747 (95% CI: 0.682-0.811) for the AI risk score. The diagnostic efficacy of each of these single clinical indicators was lower than that of the developed prediction model (P<0.001).
7. In the internal validation cohort, the clinical prediction model had an AUC of 0.856 (95% CI: 0.799-0.912) and an accuracy of 83.10%; the VA model had an AUC of 0.606 (95% CI: 0.526-0.685) and an accuracy of 60.38%; and the Mayo model had an AUC of 0.689 (95% CI: 0.613-0.767) and an accuracy of 66.98%, indicating that the clinical prediction model is more accurate than the VA model and the Mayo model.
8. Decision curve analysis was carried out using data from the validation sets. The net benefit within the threshold range of 0-100% was higher than that of the extreme curves, which indicates that the
model has good clinical application value.
9. Subgroup analyses by gender, nodule location (upper lobe or non-upper lobe), and age (below versus at or above 45 years) showed that the area under the ROC curve of the clinical prediction model was high and similar across subgroups, indicating that the model discriminates well across genders, lung lobes, and age groups.
10. Subjects whose pathological results were benign nodules were divided into an adenocarcinoma in situ subgroup and other benign subgroups: the area under the ROC curve of the clinical prediction model was high and similar, indicating no meaningful change in the discrimination of the model between the adenocarcinoma in situ subgroup and the other benign subgroups. These results are consistent with the 2021 WHO pathological classification, which defines adenocarcinoma in situ of the lung as a precursor glandular lesion.
11. Out-of-model test cohort: the AUC was 0.891 (95% CI: 0.849-0.933), with a sensitivity of 80.95% and a specificity of 86.11%, indicating good discrimination; the optimal cut-off value of the predicted probability was 0.678. A predicted probability ≥0.678 indicates a risk of lung cancer, and clinical intervention is recommended.

Conclusions:
1. CEA, ProGRP, CA125, AI risk score, spiculation sign, lobulation sign, partial solid nodule, and pure ground-glass nodule are independent risk factors for malignant pulmonary nodules. The risk prediction model has good accuracy, sensitivity, and specificity; it can better identify the malignant risk of pulmonary nodules and provide a basis for clinicians in the diagnosis and treatment of indeterminate pulmonary nodules.
2. The model has good clinical predictive value across genders, lung lobes, and age groups.
3. There was no significant change in the discrimination of the model between the adenocarcinoma in situ subgroup and the other benign
subgroups, which is in line with the 2021 WHO pathological classification of lung adenocarcinoma in situ as a precursor glandular lesion.
4. The cut-off value for distinguishing benign from malignant nodules in the external validation set was 0.678. When the predicted probability (P) for a pulmonary nodule was ≥0.678, malignancy was more likely; when it was <0.678, benign disease was more likely.
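
The regression equation and the 0.678 cut-off reported above can be illustrated with a short Python sketch. It is not the authors' implementation and makes several assumptions that the abstract does not confirm: the function and variable names are hypothetical; pGGN and PSN are treated as 0/1 indicator variables with solid nodule as the reference category (the equation gives each its own coefficient); and the intercept, which the abstract does not report, must be supplied by the caller. Without the true intercept, the resulting probabilities are illustrative only and must not be used clinically.

```python
import math

# Coefficients as reported in the abstract's regression equation.
COEF = {"CEA": 0.90, "ProGRP": 1.00, "CA125": 1.12, "AI": 2.15,
        "spiculation": 1.44, "lobulation": 1.37, "pGGN": 1.85, "PSN": 2.46}

# Cut-offs as reported: a value <= cut-off is coded 1, otherwise 2.
CUTOFF = {"CEA": 2.50, "ProGRP": 48.00, "CA125": 14.50, "AI": 70.0}


def code_level(value, cutoff):
    """Dichotomize a continuous marker using the abstract's 1/2 coding."""
    return 1 if value <= cutoff else 2


def linear_predictor(cea, progrp, ca125, ai_score,
                     spiculation, lobulation, density, intercept):
    """Return Logit(P) for one nodule.

    density is "SN", "pGGN" or "PSN". ASSUMPTION: pGGN and PSN enter as
    0/1 indicators with SN as reference. ASSUMPTION: `intercept` is not
    reported in the abstract and must be provided by the caller.
    """
    if density not in ("SN", "pGGN", "PSN"):
        raise ValueError("density must be 'SN', 'pGGN' or 'PSN'")
    lp = intercept
    lp += COEF["CEA"] * code_level(cea, CUTOFF["CEA"])
    lp += COEF["ProGRP"] * code_level(progrp, CUTOFF["ProGRP"])
    lp += COEF["CA125"] * code_level(ca125, CUTOFF["CA125"])
    lp += COEF["AI"] * code_level(ai_score, CUTOFF["AI"])
    lp += COEF["spiculation"] * (1 if spiculation else 0)
    lp += COEF["lobulation"] * (1 if lobulation else 0)
    lp += COEF["pGGN"] * (1 if density == "pGGN" else 0)
    lp += COEF["PSN"] * (1 if density == "PSN" else 0)
    return lp


def logistic(lp):
    """Convert a logit value to a probability."""
    return 1.0 / (1.0 + math.exp(-lp))


def classify(probability, cutoff=0.678):
    """Apply the reported cut-off: >= 0.678 flags a malignant risk."""
    return "malignant risk" if probability >= cutoff else "benign more likely"


if __name__ == "__main__":
    # HYPOTHETICAL intercept, chosen only so that the demo runs.
    demo_intercept = -7.0
    lp = linear_predictor(cea=3.1, progrp=40.0, ca125=16.0, ai_score=85.0,
                          spiculation=True, lobulation=False,
                          density="PSN", intercept=demo_intercept)
    p = logistic(lp)
    print(f"Logit(P) = {lp:.2f}, P = {p:.3f}, call = {classify(p)}")
```

Making the intercept a required argument is deliberate: with the reported 1/2 coding, every patient contributes at least 0.9 + 1.00 + 1.12 + 2.15 = 5.17 to Logit(P), so omitting the intercept would push nearly every predicted probability above the cut-off.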