| Part one: The association of clinicopathological features and CT features with spread through air space (STAS) in lung adenocarcinoma. Purpose: To investigate the association of clinicopathological features and CT features with spread through air space (STAS) in lung adenocarcinoma. Materials and Methods: Clinicopathological data and preoperative chest CT imaging data of 709 patients who had lung adenocarcinoma confirmed by surgical pathology and who underwent CT examination within three months before surgery were retrospectively analyzed. The Mann-Whitney U test, χ² test, or t test was used to analyze the association of clinicopathological features and CT features with STAS status. Clinicopathological and CT features that were significantly associated with STAS in univariate analysis were entered into multivariate logistic regression (LR) analysis to identify independent risk factors. P<0.05 was considered statistically significant. Results: Of the 709 patients, 301 were male and 408 were female; ages ranged from 20 to 86 years, with a mean of 59.7±12.4 years. Among them, 131 cases (18.5%) were STAS positive (STAS positive group) and 578 cases (81.5%) were STAS negative (STAS negative group). Univariate analysis showed that 5 clinical features, 5 pathological features, and 12 CT features were significantly associated with STAS status. Multivariate LR analysis revealed that 4 clinical features (gender, T stage, N stage, CEA), 3 pathological features (histological subtype, lymphatic infiltration, bronchial resection margin), and 3 CT features (nodule type, maximum tumor diameter, and PSC) were independent risk factors for STAS. Conclusion: Some preoperative CT features were associated with STAS in lung adenocarcinoma and might be used as imaging biomarkers to predict the STAS status of lung adenocarcinoma. Part two: CT-based radiomics machine learning model to predict spread through air space in lung adenocarcinoma. Purpose: To investigate the value of a CT-based
radiomics machine learning model for predicting spread through air space in lung adenocarcinoma. Materials and Methods: This retrospective cohort consisted of 709 patients from Shenzhen People's Hospital (center Ⅰ) and 120 patients from Shenzhen Cancer Hospital of Chinese Medical Academy (center Ⅱ). All patients had pathologically proven lung adenocarcinoma and underwent CT examination within 3 months before surgery. The 709 patients from center Ⅰ were randomly divided into a training cohort (n=496) and a validation cohort (n=213) at a ratio of 7:3. The patients from center Ⅱ served as the external test cohort. Clinical and CT features of the patients from both centers were retrospectively analyzed. Radiomic features were extracted from semi-automatically delineated three-dimensional volumes of interest (3D-VOI) on thin-slice CT images. Pearson correlation analysis was used to eliminate highly correlated, redundant features (r>0.8). The Mann-Whitney U test was used to compare radiomic features between the STAS positive and STAS negative groups, and multivariate LR was used to identify independent risk factors for STAS. Using the random forest machine learning algorithm, four models (model I [radiomic features], model II [CT features], model III [clinical features], and model IV [combination of radiomic, CT, and clinical features]) were developed in the training cohort and validated in the validation cohort. This procedure was repeated 101 times. The area under the ROC curve (AUC) was used to evaluate predictive performance. The Kruskal-Wallis H test, followed by the Nemenyi method, was used to compare the prediction performance of the four models. The prediction performance of the best model was then evaluated in the external test cohort. P<0.05 was considered statistically significant. Results: Among the 107 radiomic features, univariate analysis showed that 57 differed significantly between the STAS positive group and the
STAS negative group. Multivariate LR analysis revealed that 3 radiomic features (Imc1, Median, and Gray Level Non Uniformity Normalized) were independent risk factors for the STAS status of lung adenocarcinoma. The median AUC values of models I to IV were 0.795 (IQR: 0.777-0.810), 0.799 (IQR: 0.780-0.817), 0.610 (IQR: 0.582-0.631), and 0.811 (IQR: 0.797-0.830), respectively. The difference in AUC among the four models was statistically significant (χ²=245.00, P<0.001). The Nemenyi method showed that only the difference between model I and model II was not statistically significant; all other pairwise differences were statistically significant. Model IV (the combined model of radiomic, CT, and clinical features) was identified as the best model and achieved an AUC of 0.88 (95% CI: 0.79-0.97) for predicting STAS in the external test cohort, with a sensitivity of 84.2% and a specificity of 81.2%. Conclusion: The CT-based radiomics machine learning model can achieve high diagnostic performance in predicting the STAS status of lung adenocarcinoma before surgery and can provide decision support for the choice of surgical approach. |
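
Two preprocessing steps named in Part two, the Pearson correlation filter at the 0.8 threshold and the random 7:3 split of the 709 center Ⅰ patients, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the greedy keep-first ordering, the use of the absolute value of r, the handling of constant features, the random seed, and the toy data are all assumptions made for illustration.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant feature is treated as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(features, threshold=0.8):
    """Greedily keep a feature only if |r| <= threshold with every kept feature.

    `features` maps a (hypothetical) feature name to its per-patient values.
    Features are visited in insertion order, so the first of a correlated
    pair is the one that survives (an assumption, not stated in the source).
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson_r(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def split_indices(n_patients, train_fraction=0.7, seed=0):
    """Randomly split patient indices into training and validation sets."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)
    n_train = round(n_patients * train_fraction)
    return indices[:n_train], indices[n_train:]


# Toy data invented for illustration: "b" is an exact linear copy of "a".
toy = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.0, 6.0, 8.0, 10.0],
    "c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept = drop_correlated(toy)
train_idx, val_idx = split_indices(709)
print(kept, len(train_idx), len(val_idx))
```

On the toy data the filter keeps "a" and "c" and drops "b", and splitting 709 indices at a fraction of 0.7 yields 496 training and 213 validation cases, which matches the cohort sizes reported above. Repeating the split with different seeds would mirror the 101 repetitions described in the abstract, although the abstract does not specify how the repetitions were seeded.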