Research On Genomic Data Analysis And Survival Stage Prediction Of Breast Cancer

Posted on: 2022-11-03
Degree: Master
Type: Thesis
Country: China
Candidate: W Xing
Full Text: PDF
GTID: 2504306758980269
Subject: Computer Science and Technology
Abstract/Summary:
As a malignant disease with high morbidity and mortality, breast cancer is a serious health hazard for women. Among Asian women, breast cancer has the highest incidence of any malignant tumor. The numbers of breast cancer cases and deaths increase every year with growing population size, population aging, and the spread of high-risk factors including tobacco, obesity and infectious diseases. In recent years, advances in modern medicine have brought new diagnostic strategies and methods into wide use, breast cancer is diagnosed more effectively, and its mortality rate has declined. However, the high morbidity and mortality of breast cancer remain a serious problem. It therefore remains important to screen for potential biomarkers related to the occurrence, development and prognosis of breast cancer, and to reveal the physiological and pathological processes associated with it. At the same time, given the high lethality of breast cancer, patient survival needs to be predicted with reasonable accuracy so that patients can receive more precise care and treatment and a better quality of survival.

As important substances that transmit genetic information, genes control the expression of genetic traits, participate in all processes of human metabolism, and regulate the metabolic activity of the body. While traditional studies and diagnostics reflect only the external symptoms of patients, studying the gene expression status of cancer patients can reveal the functions and biological pathways of the related genes. Selecting key genes strongly associated with breast cancer to predict the factors driving its development can therefore reduce the risk of cancer progression and metastasis. In turn, predicting patient survival allows more effective personalized treatment, thus reducing breast cancer mortality and improving long-term survival.

In this thesis, we first analyzed genomic data of breast cancer patients for specific gene markers in order to investigate the pathogenesis of breast cancer and to identify key candidates for diagnosis and treatment. Because gene expression data are high-dimensional, small in sample size and noisy, differentially expressed gene analysis was used for the initial filtering of cancer genes, and support vector machine-recursive feature elimination (SVM-RFE) was used for further feature screening. Key genes were then selected using a protein-protein interaction (PPI) network, and survival analysis was performed on them for the initial detection of potential breast cancer biomarkers.

Next, a novel pathway feature extraction model, ADS, is proposed for multiple pathways that each involve a large number of genes. ADS fuses the genes involved in a pathway into a single new feature and uses that feature to represent the pathway. The new features are then passed through a neural network (NN), and the contribution of each feature to classification is scored with a Shapley-based procedure, which identifies the more meaningful biological pathways.

Finally, based on the set of candidate gene markers, this thesis studies the prediction of survival in breast cancer patients. A novel fusion prediction model, FSNX, is proposed. It combines the strengths of classification models such as random forest (RF), support vector machine (SVM) and neural network (NN), and uses the predicted probabilities of these models as new features for XGBoost to further improve survival prediction. In experiments, FSNX outperformed both the individual prediction models and several recent models, reaching a 3-year survival prediction accuracy of 86.81%. An application implementing the FSNX method was also developed to assist doctors in predicting the survival of breast cancer patients and to help them provide more effective personalized treatment plans.
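
The Shapley-based scoring of pathway features mentioned in the abstract can be illustrated with a small sketch. It assumes the scoring refers to standard Shapley values, which the abstract does not spell out. The model `toy_model`, its weights, the three "pathway features" and the zero baseline are all hypothetical stand-ins, not the ADS model or the neural network from the thesis. The sketch computes exact Shapley values by enumerating feature coalitions, which is only practical for a handful of features.

```python
import math
from itertools import combinations


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def toy_model(x):
    # Hypothetical stand-in for a trained classifier: a fixed logistic model
    # with one interaction term. The weights are invented for illustration.
    return sigmoid(0.8 * x[0] - 0.5 * x[1] + 0.3 * x[2] + 0.4 * x[0] * x[2])


def shapley_values(model, x, baseline):
    """Exact Shapley value of each feature for a single sample.

    A feature that is absent from a coalition is replaced by its baseline
    value. This needs on the order of 2^n model calls for n features.
    """
    n = len(x)

    def value(coalition):
        masked = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(masked)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


sample = [1.2, -0.7, 0.5]    # hypothetical fused pathway features for one patient
baseline = [0.0, 0.0, 0.0]   # hypothetical reference point
phi = shapley_values(toy_model, sample, baseline)

# Rank features by the magnitude of their contribution to the model output.
ranking = sorted(range(len(phi)), key=lambda i: abs(phi[i]), reverse=True)
print("Shapley values:", phi)
print("Features ranked by |contribution|:", ranking)
```

The values sum to the difference between the model output for the sample and for the baseline, which gives a quick sanity check. With many pathway features, exact enumeration becomes infeasible and a sampling-based approximation would be needed instead.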
Keywords/Search Tags: breast cancer, gene expression data, cancer biomarkers, biological pathway, survival stage prediction