
Study Of Gastric Cancer Prognosis And Protein-lncRNA Interaction Prediction Based On Machine Learning

Posted on: 2024-02-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Zhang
GTID: 1524307316464774
Subject: Biomedical statistics
Abstract/Summary:
Gastric cancer is a highly heterogeneous and invasive malignant tumor with an insidious onset and a high mortality rate, and it poses a serious threat to human health. Prognostic research is crucial for providing precision medicine to patients. An efficient prognostic model can accurately predict a patient's risk of death and provide a tool for formulating reasonable treatment plans. However, such risk models are currently lacking, and building an efficient prognostic model remains a challenging problem. Previous gastric cancer prognosis studies primarily used clinical features and pathological information to establish prognostic models. These models had limited predictive power because they did not consider the biological characteristics of gastric cancer, including the genetics and epigenetics of tumors. Long noncoding RNA (lncRNA) is currently an important research area in epigenetics, and it exhibits tissue-specific and disease-specific characteristics. Therefore, the rapid and effective identification of lncRNA biomarkers from a large number of genes can not only aid in the diagnosis and treatment of gastric cancer but also contribute to understanding the specific changes in lncRNA and its regulatory mechanisms during the development of gastric cancer. To this end, this dissertation presents a machine learning-based study of gastric cancer prognosis and protein-lncRNA interaction prediction. The specific research contents are as follows.
First, a semi-supervised gastric cancer prognosis model based on path topological structure networks, called BPTSS, is proposed. The BPTSS model demonstrates superior performance compared to other models in identifying gastric cancer subtypes. To address the heterogeneity among gastric cancer patients, a method is designed to extract the intrinsic correlation between cancer patients using the path topology network algorithm for predicting prognosis. This method involves constructing a similarity matrix between gastric cancer patients using the Gaussian interaction attribute kernel, determining the correlation between patients using weighted difference information entropy, and integrating these components into the gastric cancer prognosis model through an exponential decay function. Experimental results show that the BPTSS model achieves area under the curve (AUC) values of 0.954, 0.967, and 0.943 in the training set, validation set, and external validation set, respectively. These values surpass those of the back propagation neural network (0.905, 0.894, 0.903) and the competitive neural network (0.858, 0.938, 0.926). Furthermore, accuracy, precision, and F1 score are used to evaluate the performance of the BPTSS model, and the importance of prognostic features is visualized using the exponential decay function. Overall, the BPTSS model exhibits higher predictive performance and interpretability, making it a promising tool for clinical practice.
Second, the prognostic impact of adjuvant chemotherapy in patients with advanced gastric cancer is explored from a competing-risks perspective, and a cancer-specific death (CSD) nomogram is constructed. Because deaths among cancer patients are subject to competing risks, the Fine-Gray competing risk model was used to evaluate the survival status of patients after adjuvant chemotherapy while accounting for other possible causes of death. The C-index, 3-year AUC values, and 5-year AUC values for the training set/test set were 0.809/0.807, 0.750/0.729, and 0.772/0.749, respectively. The calibration curves of the nomogram also showed that the predicted probabilities of 3-year and 5-year CSD in the training set and validation set were consistent with the actual probabilities. These results show that the Fine-Gray competing risk model can effectively identify the prognostic factors of patients with advanced gastric cancer receiving adjuvant chemotherapy, and that the nomogram constructed from these factors shows good predictive performance, making it a convenient and personalized prognostic prediction tool with great potential.
Third, machine learning algorithms were used to screen oxidative stress-related lncRNAs as prognostic molecular markers of gastric cancer and to build a prognostic model. To explore the role of oxidative stress-related lncRNAs in cancer prognosis, the RNA-seq data of gastric cancer patients in the TCGA database and the corresponding clinical data were used as data sets. Eight oxidative stress-related lncRNAs (OSRLs) were obtained through the random survival forest algorithm and Cox regression analysis, and qRT-PCR experiments were used to identify the lncRNAs. When patients were stratified according to the OSRL signature, the overall survival of patients in the high-risk group was shorter than that in the low-risk group, and the receiver operating characteristic curve further verified the accuracy of the OSRL signature. A nomogram combining the OSRL signature and clinicopathological variables was subsequently constructed, which showed good performance for prognostic stratification of gastric cancer patients. In the subsequent immune analysis, gene function enrichment analysis, and drug sensitivity analysis, there were significant differences between patients in the high-risk and low-risk groups. These results provide new insights into the pathogenesis of gastric cancer, which will help provide new ideas and new targets for individualized treatment and promote precision in clinical treatment.
Finally, in order to explore the biological functions of lncRNA, a path-based lncRNA-protein interaction prediction model was proposed for the first time. By integrating protein semantic similarity, lncRNA functional similarity, known human lncRNA-protein interactions, and Gaussian interaction attribute kernel similarity, a new Path-Based LncRNA-Protein Interaction (PBLPI) prediction model is constructed. The PBLPI model uses three interrelated subgraphs to construct a heterogeneous graph and infers potential lncRNA-protein interactions through a depth-first search algorithm. PBLPI achieves reliable performance in a 5-fold cross-validation framework, with an average AUC value of 0.9244 and an area under the precision-recall curve value of 0.6478. It is expected that PBLPI will become a useful tool for identifying potential lncRNA-protein interactions, helping to discover new cancer targets and therapeutic strategies in the future.
In summary, this dissertation completed the following work: (1) constructed an interpretable prognostic model of gastric cancer based on the path topology network; (2) analyzed the prognostic impact of adjuvant chemotherapy on patients with advanced gastric cancer from the competing-risks perspective and constructed a nomogram prediction model; (3) used machine learning to identify prognostic biomarkers of gastric cancer and performed prognostic analysis and immune microenvironment analysis; (4) proposed a path-based method for predicting lncRNA-protein interactions, providing useful information for further exploration of lncRNA function in the future.
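The path-based prediction step described for PBLPI can be illustrated with a small sketch. It is not the dissertation's implementation: the abstract does not give the kernel formula, the edge threshold, the maximum path length, or the scoring rule, so all of these are assumptions here. The kernel below is the form usually called the Gaussian interaction profile kernel, which the "Gaussian interaction attribute kernel" presumably corresponds to, and the decay rule (product of edge weights raised to `alpha` times the path length) is borrowed from other path-based predictors. The names `gip_kernel`, `build_graph`, `path_score`, `threshold`, `max_len`, and `alpha` are hypothetical.

```python
import math


def gip_kernel(profiles):
    """Gaussian kernel similarity between the rows of a 0/1 interaction matrix.

    Assumed form: K(i, j) = exp(-gamma * ||row_i - row_j||^2), with gamma set to
    1 / (mean squared row norm).
    """
    n = len(profiles)
    sq_norms = [sum(v * v for v in row) for row in profiles]
    mean_norm = sum(sq_norms) / n
    gamma = 1.0 / mean_norm if mean_norm > 0 else 1.0
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * d2)
    return sim


def build_graph(interactions, lnc_sim, prot_sim, threshold=0.3):
    """Heterogeneous graph made of three subgraphs: lncRNA-lncRNA similarity,
    protein-protein similarity, and known lncRNA-protein interactions.
    Similarity edges below `threshold` (an assumed cut-off) are dropped."""
    n_l, n_p = len(interactions), len(interactions[0])
    graph = {("L", i): {} for i in range(n_l)}
    graph.update({("P", j): {} for j in range(n_p)})
    for i in range(n_l):
        for k in range(n_l):
            if i != k and lnc_sim[i][k] >= threshold:
                graph[("L", i)][("L", k)] = lnc_sim[i][k]
    for j in range(n_p):
        for k in range(n_p):
            if j != k and prot_sim[j][k] >= threshold:
                graph[("P", j)][("P", k)] = prot_sim[j][k]
    for i in range(n_l):
        for j in range(n_p):
            if interactions[i][j]:
                graph[("L", i)][("P", j)] = 1.0
                graph[("P", j)][("L", i)] = 1.0
    return graph


def path_score(graph, source, target, max_len=3, alpha=2.0):
    """Depth-first search over simple paths of at most `max_len` edges.

    Each path contributes (product of its edge weights) ** (alpha * length),
    so longer and weaker paths count less (assumed decay rule)."""
    total = 0.0

    def dfs(node, visited, product, length):
        nonlocal total
        for nxt, weight in graph[node].items():
            if nxt in visited:
                continue
            new_product = product * weight
            new_len = length + 1
            if nxt == target:
                total += new_product ** (alpha * new_len)
            elif new_len < max_len:
                dfs(nxt, visited | {nxt}, new_product, new_len)

    dfs(source, {source}, 1.0, 0)
    return total


if __name__ == "__main__":
    # Toy data: rows are lncRNAs, columns are proteins, 1 = known interaction.
    known = [[1, 1, 0],
             [1, 0, 0],
             [0, 0, 1]]
    lnc_sim = gip_kernel(known)
    prot_sim = gip_kernel([list(col) for col in zip(*known)])
    graph = build_graph(known, lnc_sim, prot_sim)
    for j in range(3):
        print("lncRNA 1 - protein", j, path_score(graph, ("L", 1), ("P", j)))
```

In this toy example only the kernel similarity is used; in the model described above, the protein semantic similarity and lncRNA functional similarity would be combined with it before the graph is built. Candidate pairs are then ranked by their path score, and pairs with no connecting path within `max_len` edges receive a score of zero.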
Keywords/Search Tags: Machine learning, Gastric cancer prognosis model, Biomarker, Survival analysis