Research And Implementation On Carcinogenic Gene Module Identification Method Of Glioblastoma Based On Integrating Multi-Omics Data

Posted on: 2024-08-01
Degree: Master
Type: Thesis
Country: China
Candidate: B T Liang
GTID: 2544307121965029
Subject: Computer technology
Abstract/Summary:
With the development of high-throughput sequencing technology, massive amounts of multi-omics data provide a valuable resource for studies on carcinogenic gene module detection. In addition, the "modularity of disease" theory has led to broad interest in methods that detect carcinogenic gene modules by integrating multi-omics data with network models. This study focuses on two crucial factors contributing to cancer development: dysregulated gene expression and gene mutation. We integrate multi-omics data to construct biological networks and identify carcinogenic gene modules based on network models. This helps to systematically unravel the pathogenic associations between biomolecules across the genome, providing essential insights for cancer prevention, diagnosis, and treatment. The main research of this paper is as follows:

(1) Research on a carcinogenic gene module identification method using co-regulatory networks. Gene expression dysregulation is essential in cancer initiation and progression, and transcriptional regulatory networks are vital for investigating it. However, existing regulatory network-based algorithms for detecting carcinogenic gene modules have limitations in capturing the synergistic pathogenesis of competing endogenous RNA (ceRNA) molecules. This study proposes a method for detecting carcinogenic gene modules based on co-regulatory networks of ceRNA molecules. First, the method constructs a ceRNA regulatory network by integrating transcriptomics and non-coding RNA regulatory data. Then, it integrates proteomics data to build a dysregulation network of critical oncogenes. Next, the study proposes the SOM-NSM algorithm, a module identification algorithm based on the self-organizing map (SOM) model, which integrates the neighbor similarity of the dysregulated network and the co-regulatory relationships of non-coding RNAs into the node attributes. To mine the optimal module structure, the SOM-NSM algorithm performs heuristic optimization training based on the SOM model and the modularity index Q. Finally, the study ranks the modules using structural entropy metrics and functional enrichment analysis to identify highly correlated carcinogenic gene modules. Applying the algorithm to glioblastoma multiforme (GBM) identified a critical carcinogenic gene module. Biological analysis demonstrated that this module and its ceRNA subnetwork were significantly associated with GBM, validating the method's effectiveness. Compared with 11 other methods, the SOM-NSM algorithm has the advantages of not requiring the number of modules to be specified in advance and of being insensitive to its parameters. The experimental results indicate that this method can detect carcinogenic gene modules more effectively and accurately and can identify ceRNA modules with higher biological relevance, which is of theoretical and practical significance for studying GBM pathogenesis.

(2) Research on a carcinogenic gene module identification method based on network representation learning. Gene mutations are a critical factor in cancer development and progression. However, current studies often integrate gene mutation data with other omics data into a single network model, ignoring the correlation and complementarity between different omics data. Therefore, this study adopts a serial "genome-transcriptome-proteome" integration strategy to systematically analyze multi-omics data. First, the study screens key oncogenes based on the genomic and transcriptomic data of GBM and uses gene expression data and proteomic data to construct a biological network of key oncogenes. Then, the study focuses on the importance of higher-order similarity in module identification and proposes MultiSimNeNc, a module identification algorithm based on network representation learning. This algorithm overcomes the limitation that most module identification algorithms require the number of modules in advance, and it is applicable to multiple network types. The performance of MultiSimNeNc is evaluated on co-expression network, dysregulated network, and benchmark network datasets. On the co-expression network, the algorithm outperforms 11 comparison algorithms on the silhouette coefficient (SC) metric. On the dysregulated network, it shows the best overall performance among 17 comparison algorithms. On the benchmark network dataset, MultiSimNeNc shows the best module identification performance. The study also analyzes the importance of integrating multi-step neighborhood structure information. The results show that progressively expanding the K-step neighborhood information provides a more accurate representation of the network structure, leading to better module detection results. Finally, the study performs a comprehensive functional analysis of the detected carcinogenic gene modules and screens the important ones. The analysis indicates that these carcinogenic gene modules are closely related to the occurrence and development of GBM, providing a theoretical basis for accurate diagnosis and targeted therapy of GBM. Additionally, MultiSimNeNc performs well on biological networks, providing a new and practical method for module identification.

(3) Design and development of a module identification website. The rapid growth of biological network data has led to the emergence of many module identification algorithms. However, biologists often face challenges in using these algorithms because they require programming skills. To better promote the module identification algorithms presented in this paper, we followed a standardized software engineering development process to design and implement an online module identification website. The website provides online access to the algorithms proposed in this paper and offers tools for online gene enrichment analysis. The interface is simple and clear, and the results are visualized in various forms, allowing users to complete module identification tasks according to their needs and save the results locally. This platform lowers the threshold for biologists to use the module identification algorithms and makes them easier to apply.

Based on the research presented, this study provides insights at the genetic level for cancer prevention, diagnosis, and treatment, and has both theoretical and practical value. Future research can further optimize and extend these methods to cover a wider variety of cancers and other diseases, providing additional support for biomedical research.
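
As an illustration of the index Q used in part (1) to guide module search, the sketch below computes the standard Newman modularity of a given partition of an undirected, unweighted network. Reading Q as Newman modularity is an assumption, since the abstract does not define it; the function name, the toy network, and the module labels are hypothetical and are not taken from the thesis or from the SOM-NSM implementation.

```python
from collections import defaultdict


def modularity(edges, membership):
    """Newman modularity Q for an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    membership: dict mapping every node to its module label.
    Uses Q = sum over modules c of [ L_c / m - (d_c / (2 m))^2 ],
    where m is the number of edges, L_c the number of edges inside
    module c, and d_c the total degree of the nodes in module c.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")
    internal = defaultdict(int)    # L_c: edges with both ends in module c
    degree_sum = defaultdict(int)  # d_c: summed degree of nodes in module c
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Hypothetical toy network: two triangles joined by a single bridge edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_modules = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_module = {node: "A" for node in range(6)}

q_two = modularity(toy_edges, two_modules)
q_one = modularity(toy_edges, one_module)
print(q_two, q_one)
```

A partition that separates the two triangles scores higher than placing all nodes in a single module, which is the kind of signal a heuristic search can use to compare candidate module structures without fixing the number of modules beforehand.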
Keywords/Search Tags:Multi-omics data, Carcinogenic gene modules, Biological network, Module identifying algorithm, Network clustering