Prediction Models For Endometrial Cancer Ovarian Metastasis Based On Clinicopathological Factors

Posted on: 2022-09-18
Degree: Master
Type: Thesis
Country: China
Candidate: X D Liu
GTID: 2504306314958549
Subject: Obstetrics and gynecology
Abstract/Summary:
Background

Endometrial cancer (EC), originating in the endometrium of the uterus, is the second most common malignancy of the reproductive system and the sixth most common malignancy in women [1]. More than 80% of endometrial cancers occur in women older than 50 years, with a median age of 63 [2]. With improving living standards, and especially the rising prevalence of obesity, the age of onset of EC is falling [3]. Treating the disease while preserving the quality of life of young patients is therefore a critical challenge for clinicians. Currently, the standard primary treatment is based on complete staging surgery, including at least total hysterectomy (TH) with bilateral salpingo-oophorectomy (BSO), and/or lymph node evaluation. Postoperative adjuvant therapy, such as radiotherapy and chemotherapy, depends on the high-risk factors of EC [4]. However, oophorectomy not only leads to surgical menopause followed by perimenopausal symptoms such as hot flushes and osteoporosis, but also greatly increases all-cause and cardiovascular mortality [5,6]. Therefore, staging surgery that spares both ovaries may be considered in young patients with early EC. Previous literature suggests that deep myometrial invasion, lymphovascular space invasion (LVSI), non-endometrioid histology, lymph node metastasis and histological grade are high-risk factors for ovarian metastasis in EC. So far, no study has quantitatively evaluated the correlation between these factors and ovarian metastasis. A prediction model for ovarian metastasis that clarifies the weight of each risk factor and guides the choice of surgical method would be of great clinical value.

Purpose

1. To explore the prevalence of, and risk factors for, EC ovarian metastasis.
2. To construct a generalized linear model (GLM) of EC ovarian metastasis and visualize the model to guide clinical decision-making.
3. To develop support vector machine models of EC ovarian metastasis to improve predictive performance, and to achieve
comparison between and within models.

Methods

1. Patients and research data

1.1 Identifying research objects: This is a retrospective case-control study. Patients who received surgical treatment for EC in the Department of Obstetrics and Gynecology, Qilu Hospital of Shandong University, from January 2010 to December 2018 were enrolled. Inclusion criteria: (1) patients diagnosed with EC by preoperative curettage or hysteroscopy; (2) patients whose surgery included at least TH with BSO. Patients were excluded if postoperative pathology showed that EC had been misdiagnosed, if they had received adjuvant chemoradiotherapy, if they had serious physical or mental disorders, or if their medical records were incomplete. Patients were randomly divided into training and test groups at a ratio of 7:3.

1.2 Selecting research variables: Basic clinical information included age, menstrual history (early menarche; late menopause), pregnancy and labor history (infertility; later age at delivery), past history (hypertension; diabetes mellitus or hyperinsulinemia; concomitant primary malignancy of other organs) and family history. Immunohistochemical factors included estrogen receptor (ER), progesterone receptor (PR) and Ki67 expression. Postoperative pathological factors included depth of myometrial invasion (≥50%; <50%), LVSI, cervical involvement, pathological type (endometrioid; non-endometrioid), histological grade (G1, G2, G3), tumor diameter, vaginal metastasis and oviduct involvement.

1.3 Follow-up: Patients were seen every 3-6 months for the first 2-3 years after surgery, every six months after 3 years and once a year after 5 years. Each review included history taking, physical examination, laboratory tests and imaging.

2. Exploring risk factors for EC ovarian metastasis

2.1 Patients’ clinicopathological factors: Continuous data were described by the median, minimum and maximum, and categorical variables by number and percentage. Categorical variables were compared by the chi-square test between the
training and validation groups. The Shapiro-Wilk test was used to test normality. Normally distributed data were compared between groups by one-way analysis of variance, and non-normal data by the Wilcoxon nonparametric test. P<0.05 was considered statistically significant.

2.2 Univariate analysis: Univariate logistic analysis was performed to explore the correlation of baseline characteristics, immunohistochemical indexes and postoperative pathological factors with ovarian metastasis.

3. Generalized linear model

3.1 Feature selection: Lasso regression [9] was applied to the factors found statistically significant in univariate analysis in order to select features.

3.2 Model establishment and visualization: The features selected by Lasso regression were entered into a multivariate logistic regression model, which was visualized as a nomogram. The nomogram assigns a quantitative score to each risk factor in the model; the scores are then summed to calculate each patient's predicted probability of ovarian metastasis. The risk threshold was calculated from Youden’s index.

3.3 Model validation and evaluation: Using cross-validation [11] and the bootstrapping method [12], the training and validation sets were used for internal and external validation, and the model was evaluated from three aspects. Discrimination, the ability of the prediction model to distinguish the outcome correctly, was measured by the ROC curve and AUC [13]. Calibration, the consistency between the values predicted by the model and the actual values, was evaluated by the Hosmer-Lemeshow test and the calibration curve [14]. Clinical utility was shown by decision curve analysis (DCA) [15].

3.4 Subgroup analysis: Patients with stage I, high-grade, endometrioid carcinoma were excluded from the total cohort, and the nomogram prediction model was validated in this group.

4. Support vector machine (SVM) model

4.1 Data preprocessing: Factor variables were converted into multiple binary variables for the subsequent analysis.

4.2 Model training and parameter tuning: SVM
models were trained on the variables that were statistically significant in univariate logistic analysis. Four kinds of SVM model with different kernel functions were trained with the R package e1071 [16]: (1) linear kernel, with cost set to 0.001, 0.01, 0.1, 1 and 5; (2) polynomial kernel, with degree set to 3, 4 and 5; (3) radial kernel, with gamma set to 0.1, 0.5, 1, 2, 3 and 4; (4) sigmoid kernel, with gamma set to 0.1, 0.5, 1, 2, 3 and 4 and the parameter "coef0" set to 0.1, 0.5, 1, 2 and 3. Parameters were tuned by 10-fold cross-validation.

4.3 Model evaluation: ROC curves were drawn with the pROC package, and AUC was calculated for comparison between and within models.

5. Statistical analysis methods: All statistical analysis was completed in RStudio 4.0.3. P<0.05 was considered statistically significant. The R packages used in the research were glmnet, pROC, Hmisc, rms, caret, e1071, kernlab and rmda.

Results

1. Patients’ clinicopathological characteristics

1.1 Inclusion and exclusion: Data were collected from a total of 2013 patients treated from January 2010 to December 2018, of whom 32 had received presurgical chemoradiotherapy, 190 had not undergone complete surgery, 10 had inconsistent pathology and 1078 had incomplete records. The remaining 703 patients were included in the study and randomly split into a training group (n=493) and a validation group (n=210) at a ratio of 7:3.

1.2 Training group

1.2.1 Baseline information: There were 348 (70.6%) patients younger than 60, 91 (18.5%) with hypertension or diabetes mellitus and 23 (4.7%) with primary malignant tumors at other sites. The median age at menarche was 15 years (range: 9-20 years) and the median age at menopause was 50 years (range: 27-64 years).

1.2.2 Pathological variables: There were 114 (23.1%) patients with deep myometrial invasion, 42 (8.5%) with LVSI, 68 (13.8%) with cervical invasion, 7 (1.4%) with oviduct metastasis and 14 (2.8%) with ovarian metastasis. There were 405 (82.2%), 35 (7.1%), 50 (10.1%) and 3 (0.6%) patients, respectively, in
stages Ⅰ, Ⅱ, Ⅲ and Ⅳ. The median tumor diameter was 3.20 cm (range: 0.30-20.0 cm).

1.2.3 Immunohistochemical characteristics: The median ER proportion index and staining intensity were 4 (range: 0-5) and 2 (range: 0-3), respectively. The median PR proportion index and staining intensity were 4 (range: 0-5) and 2 (range: 0-3), respectively. The median Ki67 proportion index was 1 (range: 0-1).

1.3 Validation group

1.3.1 Baseline information: There were 157 (74.8%) patients younger than 60, 39 (18.6%) with hypertension or diabetes mellitus and 5 (2.4%) with primary malignant tumors at other sites. The median age at menarche was 15 years (range: 11-20 years) and the median age at menopause was 50 years (range: 28-56 years).

1.3.2 Pathological variables: There were 41 (19.5%) patients with deep myometrial invasion, 26 (12.4%) with LVSI, 30 (14.3%) with cervical invasion, 2 (1.0%) with oviduct metastasis and 6 (2.9%) with ovarian metastasis. There were 171 (81.4%), 19 (9.0%), 17 (8.1%) and 3 (1.4%) patients, respectively, in stages Ⅰ, Ⅱ, Ⅲ and Ⅳ. The median tumor diameter was 3.00 cm (range: 0.20-13.0 cm).

1.3.3 Immunohistochemical characteristics: The median ER proportion index and staining intensity were 4 (range: 0-5) and 2 (range: 0-3), respectively. The median PR proportion index and staining intensity were 2 (range: 0-5) and 2 (range: 0-3), respectively. The median Ki67 proportion index was 1 (range: 0-1).

2. Univariate analysis: Concomitant primary malignancy of other organs (odds ratio, OR: 9.68, 95% CI: 2.48-30.01, P<0.001), family history (OR: 3.54, 95% CI: 1.06-10.35, P=0.095), tumor diameter (OR: 1.37, 95% CI: 1.19-1.60, P<0.001), deep myometrial invasion (OR: 3.48, 95% CI: 1.17-10.37, P<0.022), LVSI (OR: 3.08, 95% CI: 0.68-10.35, P=0.095), cervical involvement (OR: 3.67, 95% CI: 1.10-10.98, P<0.022), oviduct involvement (OR: 31.91, 95% CI: 9.20-108.85, P<0.001), ER proportion index (OR: 0.76, 95% CI: 0.56-1.04, P=0.081) and PR proportion index (OR: 0.77, 95% CI: 0.57-1.05, P=0.086) were associated with ovarian metastasis in EC.

3. Generalized linear model

3.1 Feature selection: Upon
Lasso regression, five variables were retained for model establishment: family history, concomitant primary malignancy of other organs, tumor diameter, oviduct involvement and ER proportion index.

3.2 Model establishment and visualization: Multivariable logistic analysis showed that concomitant primary malignancy of other organs (OR: 8.61, 95% CI: 1.80-37.04, P<0.001), family history (OR: 5.13, 95% CI: 1.29-20.29, P<0.001), tumor diameter (OR: 1.25, 95% CI: 1.07-1.50, P<0.001) and oviduct involvement (OR: 16.95, 95% CI: 3.73-76.25, P<0.001) increased the risk of ovarian metastasis. The ER proportion index was negatively associated with ovarian metastasis (OR: 0.81, 95% CI: 0.57-1.15, P<0.001). The GLM was visualized as a nomogram.

3.3 Model validation and evaluation

3.3.1 Internal validation: The nomogram model had an AUC of 0.86 (95% CI: 0.72-1.00) in the training cohort. The calibration curve did not show good agreement between the predicted and actual values (U index: -0.004, Brier score: 0.018). The Hosmer-Lemeshow test could not reject the null hypothesis of a good fit (P=0.34).

3.3.2 External validation: In the test group, the AUC of the nomogram model was 0.84 (95% CI: 0.60-1.00). The calibration curve gave a U index of -0.005 and a Brier score of 0.027. The P value of the Hosmer-Lemeshow test was 0.21, so the null hypothesis could not be rejected. Overall, the nomogram prediction model did not perform as well as in the training cohort.

3.3.3 Clinical benefit assessment: According to the DCA curves of the training and test groups, using the nomogram model yielded greater clinical benefit than taking no intervention. The benefit was greater in the training cohort than in the test cohort.

3.3.4 Risk threshold of the GLM: Based on Youden’s index, the risk threshold of the GLM was 0.037. Patients were classified into high-risk and low-risk groups according to this threshold. In the training cohort, 58 patients were identified as high risk and 435 as low risk. Accordingly, the ovarian metastasis rate was 11.8%. The AUC
of the risk-threshold prediction model was 0.81. In the test group, 22 patients were classified as high risk and 188 as low risk. The ovarian metastasis rate was 16.1%.

4. Subgroup analysis: The AUC of the GLM was 0.88 (95% CI: 0.78-0.98) in this group. The calibration curve gave a U index of -0.003 and a Brier score of 0.029. The Hosmer-Lemeshow test could not reject the null hypothesis (P=0.14). The GLM also achieved great clinical benefit in this group. At the risk threshold of 0.037, the AUC was 0.82. The predicted ovarian metastasis rate in the group was 16.1%, with 71 patients assigned to the high-risk group and 329 to the low-risk group.

5. Support vector machine (SVM) model

5.1 Linear SVM model: The best cost parameter for the linear SVM was 0.001, with 39 support vectors. The AUC was 0.79 (95% CI: 0.62-0.95) in the training group and 0.75 (95% CI: 0.50-1.00) in the test group.

5.2 Polynomial SVM model: The best parameters of the polynomial model were a degree of 3, coef0 of 2 and cost of 1, with 94 support vectors. The AUC was 0.92 (95% CI: 0.82-1.00) in the training group and 0.72 (95% CI: 0.42-1.00) in the test group.

5.3 Radial SVM model: The best parameter combination was a gamma of 0.5 and cost of 2.1. The radial SVM model had 178 support vectors, with an AUC of 1.00 (95% CI: 1.00-1.00) in the training group and 0.85 (95% CI: 0.72-0.98) in the test group.

5.4 Sigmoid SVM model: The best parameter combination of the sigmoid SVM model was a gamma of 0.1 and coef0 of 1. The sigmoid SVM model had 28 support vectors, with an AUC of 0.59 (95% CI: 0.46-0.71) in the training group and 0.60 (95% CI: 0.47-0.73) in the test group.

Conclusion

1. In univariate analysis, concomitant primary malignancy of other organs, family history, tumor diameter, deep myometrial invasion, LVSI, cervical involvement and oviduct involvement were positively associated with ovarian metastasis, whereas the ER and PR proportion indexes were negatively correlated with
ovarian metastasis in EC.
2. The nomogram model confirmed that family history, concomitant primary malignancy of other organs, greater tumor diameter and oviduct involvement increased the probability of EC ovarian metastasis, whereas a higher ER proportion index decreased it.
3. The nomogram model has good discrimination, calibration and clinical utility on internal and external validation.
4. The radial kernel has better predictive ability than the linear, polynomial and sigmoid kernels. Furthermore, comparison between models shows that the radial SVM model has a larger AUC than the GLM. Prediction models can be optimized with machine learning algorithms.

Innovations and Limitations

Innovations

1. This is the first study to establish a prediction model for EC ovarian metastasis, which provides a practical tool for clinical decision-making and has great clinical significance for improving the quality of life of EC patients.
2. The research explores the risk factors for EC ovarian metastasis not only among clinical baseline characteristics and postoperative factors but also among immunohistochemical indexes.
3. Machine learning algorithms are applied to gynecological oncology: GLM and SVM models were built, then compared and validated within and between models. A nomogram was used to visualize the GLM, which was evaluated from three aspects: discrimination, calibration and clinical utility.

Limitations

1. The research is based on a single institution, with limited outcome events and missing values, which contributes to selection bias. External validation is indispensable for evaluating the practicability of the nomogram model. The performance of the GLM could be improved if more variables were included.
2. The SVM models could not be visualized. They are difficult to interpret, and their practicability is limited.
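
The abstract reports a risk threshold derived from Youden’s index and compares models by AUC. The thesis did this in R (the pROC package is named above); the following is only a minimal Python sketch of the two calculations, using the standard library. The function names `auc` and `youden_threshold` and the toy probabilities and labels are hypothetical illustrations and are not data from the study.

```python
from typing import List, Tuple


def auc(scores: List[float], labels: List[int]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_threshold(scores: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (threshold, J) maximising J = sensitivity + specificity - 1.

    A case is called high risk when its score is >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    best_t, best_j = scores[0], -1.0
    for t in sorted(set(scores)):  # every observed score is a candidate cut-off
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical predicted probabilities and outcomes (1 = ovarian metastasis).
probs = [0.01, 0.02, 0.02, 0.03, 0.05, 0.08, 0.20, 0.40]
labels = [0, 0, 0, 0, 1, 0, 1, 1]

threshold, j_stat = youden_threshold(probs, labels)
high_risk = [p >= threshold for p in probs]
print("AUC:", round(auc(probs, labels), 3))
print("threshold:", threshold, "J:", round(j_stat, 3))
print("high-risk patients:", sum(high_risk), "of", len(probs))
```

Scanning every observed score as a candidate cut-off keeps the sketch short; on a cohort of several hundred patients this is still cheap, though a sorted single pass would scale better.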
Keywords/Search Tags:Endometrial cancer, Ovarian metastasis, GLM, SVM, Nomogram