Research Of Overlapping Complexes Detection Algorithm And Its Related Disease Association In Protein Networks

Posted on: 2022-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: L Xue
Full Text: PDF
GTID: 2480306527984789
Subject: Applied Mathematics
Abstract/Summary:
With the development of bioinformatics, many types of biological data have become available. Mining these data with efficient algorithms has become a research hotspot and has given rise to research directions such as genomics, proteomics, and systems biology. Using computational methods to extract biological modules from protein networks and to construct disease systems supports low-cost research on protein targeting relationships and disease mechanisms. Mature community detection methods for complex networks provide a theoretical basis for identifying functional modules in protein networks, and protein networks with overlapping structure can be studied using the idea of local fitness. Mining and analyzing functional modules based on overlapping protein complexes is an important topic in bioinformatics. This thesis studies the detection of protein complexes and the analysis of overlapping structures in three kinds of networks: community networks, protein interaction networks, and disease-related networks. The research contents are as follows:

(1) We propose an improved algorithm for detecting overlapping communities, the local expansion method based on the centered clique (CLEM). Traditional node clustering and link clustering methods for overlapping community detection have two inherent shortcomings: parameter dependency and instability. First, CLEM selects centered cliques as core seeds and introduces weights to penalize nodes that are deleted multiple times during seed expansion, which improves stability. Second, by selecting a parameter-independent fitness function and improving its iterative calculation, CLEM avoids the parameter limitation of the fitness function and reduces computational complexity. Finally, we test the algorithm on synthetic and real-world networks and show that CLEM performs well in both computing time and accuracy compared with some existing algorithms.

(2) To address the high clustering cost and heavy noise in large protein networks, we propose a noise reduction method based on local expansion (LENRM) for mining protein functional modules. In particular, the method can identify overlapping protein complexes in large protein-protein interaction (PPI) networks. By removing inter-module interactions and simulating undiscovered connections, it eliminates a large amount of false positive and false negative noise in the PPI network, respectively. The approach is tested on several benchmark datasets and compared with nine known algorithms; the results indicate superior performance in both the number and the accuracy of detected complexes.

(3) To integrate different types of biological data effectively and explore disease mechanisms, we propose a new framework that combines protein-protein networks, disease-gene associations, and disease-complex pairs to cluster protein complexes and infer disease associations. A single disease is usually caused by multiple gene products acting together, such as protein complexes, rather than by a single gene. It is therefore meaningful to discover protein communities in the PPI network and use them to infer disease-disease associations. The complexes discovered by our approach are superior to those of four other popular methods in quality (Sn, PPV, and ACC) and in clustering quantity on three PPI networks. A systematic analysis shows that disease pairs sharing more protein complexes (such as Glucose and Lipid Metabolic Disorders) are more similar, and that overlapping proteins may play different roles in different diseases. These findings can provide clinical scholars and medical practitioners with new ideas on disease identification and treatment.

Moving from general complex networks to specific biological networks, this thesis explores how to cluster communities in them, especially how to extract overlapping protein complexes in biological networks. A research framework covering genes, proteins, phenotypes, and diseases is then constructed from the associations between protein complexes and gene-disease pairs to predict unknown disease associations, forming a complete system structure.
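
The general idea of local expansion from a clique seed under a fitness function can be illustrated with a minimal Python sketch. This is not the CLEM or LENRM algorithm: the abstract does not state the fitness function, the centered-clique selection rule, or the node weighting scheme, so the names `build_graph`, `fitness`, and `expand_seed`, the ratio k_in / (k_in + k_out), and the toy network are all illustrative assumptions.

```python
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[int, Set[int]]  # undirected graph stored as adjacency sets


def build_graph(edges: Iterable[Tuple[int, int]]) -> Graph:
    """Build an undirected adjacency-set graph from an edge list."""
    graph: Graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def fitness(graph: Graph, community: Set[int]) -> float:
    """Assumed fitness: share of the community's edge endpoints that stay inside it.

    k_in counts every internal edge twice (once per endpoint);
    k_out counts edges that leave the community.
    """
    k_in = 0
    k_out = 0
    for node in community:
        for nbr in graph[node]:
            if nbr in community:
                k_in += 1
            else:
                k_out += 1
    total = k_in + k_out
    return k_in / total if total else 0.0


def expand_seed(graph: Graph, seed: Set[int]) -> Set[int]:
    """Greedily add the neighbouring node that most increases the fitness.

    Expansion stops when no neighbouring node improves the fitness.
    """
    community = set(seed)
    current = fitness(graph, community)
    while True:
        # Nodes adjacent to the community but not yet part of it.
        frontier = {n for v in community for n in graph[v]} - community
        best_node, best_fit = None, current
        for cand in sorted(frontier):  # sorted so ties break deterministically
            cand_fit = fitness(graph, community | {cand})
            if cand_fit > best_fit:
                best_node, best_fit = cand, cand_fit
        if best_node is None:
            return community
        community.add(best_node)
        current = best_fit


# Hypothetical toy network: two 4-cliques that share node 3.
clique_a = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
clique_b = [(3, 4), (3, 5), (3, 6), (4, 5), (4, 6), (5, 6)]
graph = build_graph(clique_a + clique_b)

community_a = expand_seed(graph, {0, 1, 2})
community_b = expand_seed(graph, {4, 5, 6})
overlap = community_a & community_b
```

Because each seed is expanded independently, a node can end up in more than one community. In this toy network both expansions absorb node 3 and then stop, so node 3 is the overlapping node, which is the kind of structure that overlapping complex detection aims to recover.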
Keywords/Search Tags:community detection, protein interaction network, overlapping protein complexes, disease association, functional modules