Cancer Prediction And Identification Of Related Tumor Markers

Posted on: 2024-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Hu
Full Text: PDF
GTID: 2544306911493764
Subject: Statistics
Abstract/Summary:
Cancer is a malignant disease associated with genetic changes, and its high incidence and mortality have made it the greatest threat to human life. The emergence of epigenetics has provided a new perspective on cancer diagnosis. Further research has revealed that DNA methylation, a chemical modification that regulates gene expression without changing the DNA sequence, is closely linked to the formation and progression of tumors. Hypermethylation reduces the expression level of transcriptional genes and causes transcriptional abnormalities, while hypomethylation deactivates tumor suppressor genes and leads to cell proliferation, culminating in the development of malignant tumors. The regulation of gene expression is crucial for the development of organisms. During the formation and progression of tumors, silenced genes begin to be highly expressed, while normal genes are downregulated, and these abnormally expressed genes induce tumor formation. In addition, microorganisms are distributed throughout the human body, where they participate in and regulate various biological functions, and abnormal microbial expression can increase the risk of cancer. With the development of high-throughput sequencing technology, multi-omics data have been accumulating, providing new opportunities for mining the potential relevance of each type of omics data to cancer.

(1) An accurate model for diagnosing cancer and identifying its type was established using DNA methylation data. First, the coefficient of variation was combined with Elastic Net for feature dimensionality reduction, which selected 69-218 and 2554 highly correlated, informative CpGs from the single-class and multi-class datasets, respectively. Functional analysis of the genes corresponding to these CpGs yielded 289 candidate cancer gene markers. Second, the SMOTE algorithm and the Focal Loss function were used to address the imbalanced sample distribution in the cancer datasets. Next, a fully connected neural network was trained on the selected CpGs to serve as a cancer diagnosis and cancer type recognition model (iCancer-Pred). In 5-fold cross-validation and independent testing, the model displayed excellent prediction performance across cancer types, with an overall classification accuracy of 97%, outperforming other advanced machine learning models.

(2) An online cancer type recognition web server called "iCancer-Pred" was built. Given a patient's DNA methylation data as input, the server outputs the probability that the patient has cancer and the predicted cancer type.

(3) A lung cancer staging model was established based on multi-omics data. First, differential expression analysis was performed on gene expression and microbiome data to identify differentially expressed genes and microbial species. An attention mechanism was then incorporated into a deep neural network, and these differential features were used to train the staging model. Finally, the model was evaluated using 5-fold cross-validation, which showed an accuracy of over 80%.

This thesis examines the relationship between DNA methylation and various cancer types, and constructs a model with superior diagnostic performance together with a user-friendly diagnostic platform. It also explores the relationship between gene expression, microorganisms, and the developmental stage of lung cancer, resulting in a staging model with improved prediction accuracy. In addition, this work demonstrates the generalization capability of the proposed cancer prediction framework.
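As a rough illustration of two steps named in part (1), the sketch below shows a coefficient-of-variation filter over CpG sites and a focal loss for a single sample. It is a minimal example under stated assumptions, not the thesis's implementation: the function names, the CpG identifiers, the toy beta values, the threshold of 0.2, and the defaults `gamma=2.0` and `alpha=1.0` are all hypothetical, and the Elastic Net, SMOTE, and neural network stages are omitted.

```python
import math
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Return std / |mean| for one CpG site's beta values (0.0 if the mean is 0)."""
    m = mean(values)
    return pstdev(values) / abs(m) if m != 0 else 0.0


def filter_by_cv(beta_by_cpg, threshold):
    """Keep CpG sites whose coefficient of variation exceeds the threshold."""
    return [cpg for cpg, vals in beta_by_cpg.items()
            if coefficient_of_variation(vals) > threshold]


def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Focal loss for one sample, given the predicted probability of its true class.

    With gamma = 0 and alpha = 1 this reduces to ordinary cross-entropy.
    """
    p = min(max(p_true, 1e-12), 1.0)  # clamp to avoid log(0)
    return -alpha * (1.0 - p) ** gamma * math.log(p)


# Hypothetical toy data: methylation beta values per CpG site across four samples.
beta = {
    "cg_a": [0.10, 0.90, 0.15, 0.85],  # varies strongly between samples
    "cg_b": [0.50, 0.51, 0.49, 0.50],  # nearly constant, carries little signal
}
selected = filter_by_cv(beta, threshold=0.2)  # assumed threshold

# A confidently correct prediction is down-weighted relative to a poor one.
easy = focal_loss(0.9)
hard = focal_loss(0.1)
```

In this toy setting only the highly variable site survives the filter, and the focal term `(1 - p) ** gamma` shrinks the loss of well-classified samples so that training focuses on the hard, often minority-class, cases.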
Keywords/Search Tags: Multi-Omics Data, Tumor Markers, Feature Selection, Elastic Net, Prediction Model, Stage Model