
Protein Functional Module Identification Based On Adaptive Graph Convolution Network

Posted on: 2022-11-19
Degree: Master
Type: Thesis
Country: China
Candidate: H W Chen
GTID: 2480306773471594
Subject: Automation Technology
Abstract/Summary:
The rapid development of high-throughput experimental techniques has produced most of the available protein-protein interaction (PPI) network data. Protein interactions are associations between proteins at the molecular level, and they are studied from the perspectives of biochemistry, signal transduction, and genetic networks. These interactions are indispensable to all life activities: they underlie the metabolic activity of cells, which receive exogenous or endogenous signals and, through specific signal transduction pathways, regulate gene expression and maintain their biological characteristics. Identifying protein functions from PPI networks gives biologists an efficient way to understand cellular organization and function. A protein functional module is formed by multiple basic protein nodes that cooperate to express a specific function in the PPI network, in the form of either a protein complex or a protein pathway.

Existing data mining methods try to combine various kinds of biological information to improve the quality of protein complex and pathway prediction. However, fusing different biological information into a unified computational model that can mine different protein functional modules remains a great challenge. Graph neural networks have recently been widely applied in life science research, for example in mining protein-protein interactions and in discovering multi-omics disease markers and drug-disease associations. Attributed graph embedding methods fuse node attributes with graph structure information in non-Euclidean space, which yields more robust node representations and higher-quality information mining.

In PPI networks, protein complexes are assembled from multiple tightly connected basic protein nodes, whereas the basic protein nodes of a protein pathway are only indirectly linked and are higher-order neighbors of one another. Most existing methods focus only on identifying protein complexes and ignore the fact that the basic proteins that make up protein pathways are also present in PPI networks. They use shallow graph neural network models that learn node information only from nearby neighbors in order to mine protein complexes in dense subgraphs. Such models cannot observe higher-order node information, which makes protein pathways difficult to identify. Meanwhile, oversmoothing is unavoidable in deep graph neural networks. In traditional methods, a fixed, shallow stack of convolutional layers leaves node representations undersmoothed, whereas naively stacking many graph neural network layers makes node representations indistinguishable; both degrade performance on downstream tasks. Nodes in subgraphs of different density have different neighborhood conditions, and the information to be learned from them often differs.

Therefore, we propose a novel method called Nodewise Adaptive Smoothness-transition Graph Convolution (NASGC), which uses an adaptive order of graph convolution for each situation to better characterize the nodes. Smoothness is an indicator of how similar the feature representations of nearby nodes are. The proposed smoothness sensor measures the smoothness of the graph after the neighbor signals have been graph-filtered and adaptively terminates the convolution once the smoothness saturates, which prevents oversmoothing. In protein function identification, the NASGC model therefore attends simultaneously to protein complexes in dense subgraphs and to protein pathways in sparse ones. The protein node representations learned from structures of different density are combined with gene ontology attribute information. Finally, all maximal cliques in the whole graph are mined, and the different protein complexes and protein pathways are identified by node similarity. Experimental results show that NASGC outperforms the competing algorithms on different PPI network datasets.
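The node-wise adaptive stopping idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual NASGC filter or smoothness sensor, so everything below is an assumption made for illustration: the filter is a plain mean over a node and its neighbors, smoothness is taken to be the mean cosine similarity between a node and its neighbors, and a node stops being filtered once its smoothness gain falls below a hypothetical tolerance `tol`. The function names and the toy graph are likewise hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def local_smoothness(node, adj, feats):
    """Assumed smoothness measure: mean cosine similarity to the neighbors."""
    nbrs = adj[node]
    if not nbrs:
        return 1.0  # an isolated node has nothing left to smooth
    return sum(cosine(feats[node], feats[n]) for n in nbrs) / len(nbrs)

def adaptive_smooth(adj, feats, max_order=10, tol=1e-3):
    """Repeat a mean-aggregation filter, freezing each node independently.

    adj maps every node to a list of neighbors (undirected graph), and feats
    maps every node to a feature vector. Returns (features, orders), where
    orders[v] counts the filtering steps node v received before its
    smoothness gain dropped below tol or max_order was reached.
    """
    feats = {v: list(x) for v, x in feats.items()}
    orders = {v: 0 for v in adj}
    active = set(adj)
    prev = {v: local_smoothness(v, adj, feats) for v in adj}
    for _ in range(max_order):
        if not active:
            break
        # One low-pass step: average each active node with its neighbors.
        # Frozen nodes keep their features but still feed their neighbors.
        new_feats = dict(feats)
        for v in active:
            group = [feats[v]] + [feats[n] for n in adj[v]]
            new_feats[v] = [sum(col) / len(group) for col in zip(*group)]
        feats = new_feats
        for v in list(active):
            orders[v] += 1
            cur = local_smoothness(v, adj, feats)
            if cur - prev[v] < tol:  # smoothness saturated: stop this node
                active.discard(v)
            prev[v] = cur
    return feats, orders

# Toy graph (hypothetical): a dense triangle 0-1-2 attached to a sparse chain 2-3-4-5.
demo_adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
demo_feats = {0: [1.0, 0.1], 1: [0.9, 0.2], 2: [1.0, 0.0],
              3: [0.5, 0.5], 4: [0.1, 0.9], 5: [0.0, 1.0]}
demo_out, demo_orders = adaptive_smooth(demo_adj, demo_feats)
```

Here `demo_orders` records how many filtering steps each node received, which is the per-node convolution order in this simplified setting. A real implementation would presumably operate on sparse matrices, use a learned or spectral graph filter, and then combine the resulting representations with gene ontology attributes before the clique-mining stage.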
Keywords/Search Tags: Protein functional module identification, Signal transduction pathway, Deep graph neural network, Smoothness of graph signals, Attributed network