
Research on the Prognosis of ER+ and ER- Breast Cancer

Posted on: 2019-02-05  Degree: Master  Type: Thesis
Country: China  Candidate: L Wang
GTID: 2404330548970804  Subject: Applied Statistics
Abstract/Summary:
Breast cancer seriously endangers women's health and has a high incidence among female malignant tumors. Its occurrence and development are complex, and its pathogenesis is still not completely understood and remains disputed. Traditional pathological examination methods such as physical examination, mammography or CT, and ultrasound or MRI are not sufficient to predict the treatment outcome of breast cancer, and they often depend on the subjective judgement of doctors. Breast cancer is a malignant tumor mediated by multiple genes and multiple steps, and different gene expression profiles lead to different biological characteristics. Intrinsic gene expression has a great impact on patient prognosis, so it is of great significance to study the pathogenesis of breast cancer through molecular biology. By predicting the risk of recurrence, high-risk patients can benefit from adjuvant therapy, while low-risk patients can be spared unnecessary treatment. In this thesis, microarray data of ER+ and ER- breast cancer were analyzed. After data preprocessing, a univariate Cox proportional hazards regression model was used for preliminary gene screening; a risk score was calculated for each patient from the screened genes, and the patients were classified accordingly. The LASSO method was then used to screen the genes further, and the selected genes were applied in the survival tree method for prediction and classification. Kaplan-Meier curves and the log-rank test showed that patients with the low-risk gene signature and patients with the high-risk gene signature differed significantly in relapse-free survival. The testing set was used to verify the validity of the result. Finally, a classification model for ER+ and ER- breast cancer was established: univariate logistic regression was first used to screen the genes preliminarily, and the selected genes were then put into a random forest model. Validated on the testing set, the model reached a classification accuracy of 93.75%. The model predicts the recurrence risk classification of ER+ and ER- breast cancer well, and 5 genes related to recurrence were identified for ER+ and for ER- breast cancer respectively. Some of the selected genes have already been reported in the relevant literature, indicating that they are closely related to the occurrence and development of breast cancer. The other genes need further experiments to verify the role they play in breast cancer so that they can inform treatment. The classification of ER+ and ER- breast cancer patients also achieved a high accuracy, which supports the validity of the models used.
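The risk-score grouping and the Kaplan-Meier/log-rank comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not give the risk-score formula or the cutoff, so the linear score (coefficient times expression, summed over genes) and the median split used here are assumptions, and the function names and toy values are hypothetical, not taken from the thesis.

```python
import math
from statistics import median

def risk_scores(expression, coefficients):
    """Assumed linear risk score: sum of coefficient * expression over genes."""
    return [sum(b * x for b, x in zip(coefficients, row)) for row in expression]

def split_by_median(scores):
    """Assumed cutoff: 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    survival = 1.0
    curve = []
    for t in sorted({u for u, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        relapsed = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - relapsed / at_risk
        curve.append((t, survival))
    return curve

def logrank_test(times, events, groups):
    """Two-group log-rank test; returns (chi-square statistic, p-value), 1 d.o.f."""
    first = sorted(set(groups))[0]
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        n = sum(1 for u in times if u >= t)
        n1 = sum(1 for u, g in zip(times, groups) if u >= t and g == first)
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        d1 = sum(1 for u, e, g in zip(times, events, groups)
                 if u == t and e and g == first)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0  # no information to separate the groups
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Toy data (made up for illustration): 4 patients x 2 genes.
toy_expression = [[2.1, 0.3], [0.4, 1.8], [1.9, 0.5], [0.2, 2.2]]
toy_coefficients = [0.8, -0.6]   # hypothetical Cox coefficients
toy_times = [12, 60, 20, 48]     # months to relapse or censoring
toy_events = [1, 0, 1, 1]        # 1 = relapse observed, 0 = censored

toy_groups = split_by_median(risk_scores(toy_expression, toy_coefficients))
toy_chi2, toy_p = logrank_test(toy_times, toy_events, toy_groups)
```

With real microarray data, the coefficients would come from the fitted Cox model and the comparison would be repeated on the testing set; a dedicated survival-analysis library would normally replace these hand-written estimators.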
Keywords/Search Tags: Cox proportional hazards regression model, LASSO, survival tree, logistic regression, random forest