Multi-omics Data Integration Analysis Method And System Based On Deep Clustering Model And Traditional Model

Posted on: 2022-10-22
Degree: Master
Type: Thesis
Country: China
Candidate: L L Wu
Full Text: PDF
GTID: 2494306569981729
Subject: Software engineering
Abstract/Summary:
In recent years, advances in science and technology have produced large amounts of multi-omics data. Integrated omics data make it possible to observe and diagnose diseases more comprehensively across multiple molecular levels, which supports comprehensive molecular classification of patients and contributes to the development of precision medicine. Many methods exist for studying multi-omics data. However, these integration methods are usually distributed as packages. They are relatively fixed and narrow, target only specific types of data, and give users no choice among methods. They also rarely combine dimensionality reduction with clustering when integrating multiple omics data, so the clustering results are often unsatisfactory. These problems are a major obstacle for researchers who work with multi-omics data but have no programming experience.

To solve these problems, this paper proposes a multi-omics data integration analysis method based on a deep clustering model and traditional models, and develops a system, referred to as IOAT, that implements it. IOAT gives non-programmers many methods for studying high-dimensional multi-omics data, including diversified data preprocessing, feature screening, clustering, survival analysis, and other related functions. The system divides the workflow into two categories according to whether the multi-omics data have prior labels. The first category is for labeled multi-omics data. For it, this paper proposes a joint deep clustering method based on the standardized Euclidean distance (DKM+). This method embeds the clustering method into the dimensionality reduction model for joint optimization and, in line with the characteristics of multi-omics data, uses the standardized Euclidean distance to compute the distances in the loss functions of both the dimensionality reduction model and the clustering model, so that the jointly optimized model obtains better clustering results. The second category provides single-factor and multi-factor feature selection methods for unlabeled multi-omics features, and the methods can be freely combined. After the features have been reduced in dimension, users can also perform risk assessment on them and predict survival probabilities over different time periods. Next, the user can either set the number of clusters manually or let the system choose it when clustering the selected features, and the number of clusters that gives good survival analysis results is used as the subtype classification result.

The system has been verified on multiple real cancer project data sets from TCGA and on simulated data sets. In summary, the method and system developed in this work provide diversified data analysis methods for different types of multi-omics data. The feature selection results can provide medical staff and biologists with genes closely related to tumor staging, which can serve as a reference for the connection between omics data and clinical phenotypes and thereby help establish personalized cancer treatment plans. The clustering results can provide a reference for the molecular subtype of a specific cancer, thereby supporting precise treatment of patients.
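The abstract does not give the formula behind the standardized Euclidean distance used in DKM+. As a rough illustration only, the sketch below uses the common textbook definition, in which each squared feature difference is divided by that feature's variance, and applies it to a cluster-assignment step. The function name, the toy data, and the `eps` guard are assumptions made for this example. It does not reproduce the DKM+ loss or the IOAT implementation.

```python
import numpy as np

def standardized_euclidean(x, centers, variances, eps=1e-12):
    """Distance from every sample to every cluster center.

    Each squared feature difference is divided by that feature's variance,
    so features on large scales do not dominate the distance.
    x: (n, d), centers: (k, d), variances: (d,). Returns an (n, k) array.
    """
    diff = x[:, None, :] - centers[None, :, :]  # shape (n, k, d)
    return np.sqrt(np.sum(diff ** 2 / (variances + eps), axis=-1))

# Toy data (hypothetical): two features on very different scales.
x = np.array([[0.0, 0.0],
              [1.0, 50.0],
              [2.0, 200.0]])
variances = x.var(axis=0, ddof=1)        # per-feature sample variance
centers = np.array([[0.0, 0.0],
                    [2.0, 200.0]])       # two illustrative cluster centers

distances = standardized_euclidean(x, centers, variances)
labels = distances.argmin(axis=1)        # nearest-center assignment
```

In a jointly optimized model, a distance of this kind would replace the plain Euclidean distance inside the loss terms, so that no single high-variance omics feature dominates either the reconstruction or the clustering objective.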
Keywords/Search Tags:Multi-omics data, Joint deep clustering, Traditional clustering method, Feature selection, Cancer subtype, System