Research On Inferring The Disease Trajectory Of Cancer Patients Via Representation Learning

Posted on: 2024-05-10
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Zhang
GTID: 2544307064486124
Subject: Computer Science and Technology

Abstract/Summary:
Fine-grained cancer progression status is essential for understanding the mechanisms of disease progression and selecting effective treatments. However, the clinical TNM (tumor, node, metastasis) staging information available for cancer patients is coarse-grained and lacks continuous status labels for samples from early to advanced stages, which makes it unsuitable for dynamic methods that detect the critical events or states of disease progression. So far, there has been limited progress in modeling the dynamics of cancer progression from bulk RNA-seq data of real patients, so a method for generating fine-grained disease progression labels is needed. In this study, we propose CPDRI, a novel deep representation learning framework informed by existing knowledge of cancer progression, to predict the fine-grained disease progression status of cancer patients from TNM staging and transcriptomic data.

First, we jointly apply HSIC Lasso and differential gene expression analysis to identify cancer progression-related genes, most of which have previously been studied in connection with the mechanisms of cancer progression. Based on patient similarity and the order of clinical stages, we construct a directed patient graph that coarsely establishes the relationships between samples. Next, we generate two embeddings for each patient: a graph embedding, derived from a knowledge-informed graph autoencoder that learns the topology of the directed patient graph, and a gene expression embedding, derived from a metric learning model based on a residual convolutional neural network that learns from the patient's cancer development-related genes. By combining these two embeddings and optimizing two loss functions, we can predict the order of disease progression for any approximate cases beyond the existing TNM staging and obtain a new, reconstructed directed graph of disease progression. Because this graph may contain loops, which would seriously mislead judgments about developmental paths, we propose a new score for each link and develop a link filter that eliminates loops by removing some links, yielding a directed acyclic graph. Every path in the directed acyclic graph provides temporal information, and we extract the longest path to analyze the disease progression status of cancer patients.

We validated the results at both the coarse-grained and fine-grained levels. Applied to LUAD and BRCA samples from the TCGA database, CPDRI performed well in predicting the disease progression order of approximate cases under evaluation metrics including precision, recall, F1-score, and Spearman correlation. The inferred disease development path is highly consistent with TNM staging, and based on the longest paths in the CPDRI-reconstructed disease progression digraph, state-of-the-art single-cell trajectory inference methods can also identify disease progression dynamics consistent with TNM staging. A time series prediction model built on the developmental path inferred by CPDRI also achieves good results: it can predict future states ten steps ahead with a root mean square error of about 0.8. This confirms that the inferred disease progression path not only agrees with TNM staging but also reconstructs the dynamic process of disease progression well.

Finally, as a practical exploration, we predicted the critical tipping points of cancer development along the path inferred by CPDRI and identified marker genes for the critical tipping period. At the level of TNM staging, the emergence of some critical states coincides with the general understanding of cancer progression, such as the emergence of pre-metastatic critical points. At a fine-grained level, a number of critical points can be identified at the transitions between cancer stages in individual samples, and the signal genes that characterize these individual-level critical points have been shown to be associated with cancer progression. Overall, CPDRI is expected to serve as a foundational tool for enhancing the temporal resolution of cancer staging data, capturing the dynamics of disease progression, and providing insights into the underlying mechanisms, thereby helping to design effective intervention strategies for cancer progression.
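
The loop-removal and longest-path steps described in the abstract can be illustrated with a small sketch. The abstract does not define the link score or the exact filtering rule, so everything here is an assumption made for illustration: the scores are made-up numbers, the patient IDs are hypothetical, and the filter is a simple greedy rule (drop the lowest-scoring link on each detected loop) rather than the CPDRI link filter itself. The function names `find_cycle`, `filter_links`, and `longest_path` are likewise hypothetical.

```python
from collections import defaultdict


def find_cycle(nodes, edges):
    """Return the edges of one directed cycle, or None if the graph is acyclic."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
    color = {n: 0 for n in nodes}  # 0 = unvisited, 1 = on DFS stack, 2 = finished
    stack = []

    def dfs(u):
        color[u] = 1
        stack.append(u)
        for v in adj[u]:
            if color[v] == 1:  # back edge: v is on the stack, so a loop closes here
                cyc = stack[stack.index(v):] + [v]
                return list(zip(cyc, cyc[1:]))
            if color[v] == 0:
                found = dfs(v)
                if found:
                    return found
        stack.pop()
        color[u] = 2
        return None

    for n in nodes:
        if color[n] == 0:
            found = dfs(n)
            if found:
                return found
    return None


def filter_links(scores):
    """Greedy stand-in for a link filter (assumption, not the thesis's rule):
    repeatedly delete the lowest-scoring link on a detected loop."""
    kept = dict(scores)
    while True:
        nodes = sorted({n for edge in kept for n in edge})
        cycle = find_cycle(nodes, kept)
        if cycle is None:
            return kept
        weakest = min(cycle, key=lambda e: kept[e])
        del kept[weakest]


def longest_path(edges):
    """Longest path (by number of links) in a DAG, via topological order + DP."""
    nodes = {n for edge in edges for n in edge}
    if not nodes:
        return []
    adj = defaultdict(list)
    indeg = {n: 0 for n in nodes}
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1
    order, queue = [], sorted(n for n in nodes if indeg[n] == 0)
    while queue:  # Kahn's algorithm
        u = queue.pop()
        order.append(u)
        for v in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    best = {n: 0 for n in nodes}
    prev = {n: None for n in nodes}
    for u in order:
        for v in adj[u]:
            if best[u] + 1 > best[v]:
                best[v], prev[v] = best[u] + 1, u
    end = max(order, key=lambda n: best[n])
    path = []
    while end is not None:
        path.append(end)
        end = prev[end]
    return path[::-1]


# Hypothetical patients p1..p4 with made-up link scores; p3 -> p1 closes a loop.
scores = {("p1", "p2"): 0.9, ("p2", "p3"): 0.8, ("p3", "p1"): 0.2,
          ("p3", "p4"): 0.7, ("p1", "p4"): 0.5}
dag = filter_links(scores)
path = longest_path(dag)
print(path)
```

In this toy input the weakest link on the loop is removed, and the remaining graph orders the four hypothetical patients along a single path. A real link score would presumably draw on the learned embeddings and stage order, which this sketch does not model.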
Keywords: cancer development, temporal inference, graph neural networks, representation learning, graph pruning