Multi-omics Data Integration Via Network Fusion Model Through The High Order Path Similarity

Posted on: 2020-10-31
Degree: Master
Type: Thesis
Country: China
Candidate: A D Xu
GTID: 2370330590960616
Subject: Computer Science and Technology
Abstract/Summary:
Recent advances in high-throughput sequencing have accelerated the accumulation of omics data on the same tumor tissue from multiple sources, including genomics, epigenomics, and transcriptomics. Intensive study of multi-omics integration on tumor samples enables structured observation and description of diseases, especially tumors, at multiple molecular levels. It can therefore stimulate progress in precision medicine and is promising for detecting potential biomarkers. However, current methods are limited by the highly unbalanced dimensions of omics data and by the high noise introduced when biological information is measured and quantized, which makes it difficult to accurately assess the relevance and importance of each data source. In this paper, we propose an omics data integration method named high-order path elucidated similarity (HOPES). HOPES fuses the similarities derived from the various omics data sources to resolve the dimensional discrepancy, and progressively elucidates the similarities from each type of omics data into an integrated similarity using various high-order connected paths. Through a series of incremental constraints for commonality, HOPES takes into account both the specificity of each single data type and the consistency between different data types. The fused similarity matrix gives global insight into the correlation between patients, compensates for missing information and errors in any single data source, and efficiently distinguishes subgroups. Moreover, a consensus clustering algorithm based on spectral clustering and a feature selection method based on l1-regularized regression are adopted to categorize samples and to trace the global similarity matrix back to the original omics features. We tested the performance of HOPES on a simulated dataset and on five empirical tumor datasets. On the simulated data, our method achieved superior accuracy and high robustness compared with several benchmark methods. Further experiments on the five cancer datasets demonstrated that HOPES achieved superior performance in cancer classification. The stratified subgroups showed statistically significant differences in survival. We further located and identified the key genes, methylation sites, and microRNAs within each subgroup. These features showed high potential prognostic value and were enriched in many cancer-related biological processes and pathways. In conclusion, the HOPES method proposed in this paper combines the information from multiple omics sources, learns the global structure accurately and stably, and performs well in cancer-related clinical tasks. It is expected not only to support accurate molecular typing of cancer, but also to provide a new approach to the detection of potential biomarkers.
Keywords/Search Tags:Omics data, High order similarity, Network fusion, Convex optimization
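
The general pipeline outlined in the abstract (one patient-by-patient similarity matrix per omics layer, a fusion step, then spectral clustering on the fused matrix) can be illustrated with a minimal Python sketch. The sketch does not implement HOPES: the abstract does not specify the high-order path update or its constraints, so a plain average stands in for the fusion step. The Gaussian kernel, the median bandwidth heuristic, the two-group split, and all function names are assumptions made here for illustration only.

```python
import numpy as np


def gaussian_similarity(X, sigma=None):
    """Patient-by-patient similarity from one omics matrix (rows = patients)."""
    sq_norms = (X ** 2).sum(axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives from round-off
    if sigma is None:
        # Median heuristic for the kernel bandwidth (an assumption, not from the thesis).
        off_diag = sq_dists[~np.eye(len(X), dtype=bool)]
        sigma = np.sqrt(np.median(off_diag))
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def average_fusion(similarities):
    """Placeholder fusion: element-wise mean of n-by-n matrices (NOT the HOPES update)."""
    return np.mean(np.stack(similarities), axis=0)


def spectral_bipartition(S):
    """Split samples into two groups using the normalized graph Laplacian of S."""
    degree = S.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    laplacian = np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues come back in ascending order
    # Second-smallest eigenvector, rescaled to the random-walk form, then thresholded at 0.
    fiedler = eigvecs[:, 1] * d_inv_sqrt
    return (fiedler > 0).astype(int)


# Synthetic demo data: 20 patients in two groups, measured by two "omics" layers
# with very different feature counts (50 vs. 8), mimicking unbalanced dimensions.
rng = np.random.default_rng(0)
labels_true = np.repeat([0, 1], 10)
omics_a = rng.normal(size=(20, 50)) + 3.0 * labels_true[:, None]
omics_b = rng.normal(size=(20, 8)) + 3.0 * labels_true[:, None]

fused = average_fusion([gaussian_similarity(omics_a), gaussian_similarity(omics_b)])
labels_pred = spectral_bipartition(fused)
```

Because each layer is first reduced to an n-by-n similarity matrix, the layers can be combined regardless of how many features each one has, which is the point the abstract makes about dimensional discrepancy. A consensus step over repeated clusterings and an l1-regularized regression for tracing features, both mentioned in the abstract, are omitted from this sketch.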