
Study On The Algorithm Of Motif Discovery In Biological Network

Posted on: 2019-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: L Ding
Full Text: PDF
GTID: 2370330545473995
Subject: Computer Science and Technology
Abstract/Summary:
Large numbers of high-throughput experiments have produced petabyte-scale omics data, which contain a wealth of information on the roles of biomolecules. Extracting valuable information from these data is a major challenge in computational biology today. To study the regulatory mechanisms among biomolecules, a common approach is to abstract the interactions between biomolecules into a regulatory network and then mine that network with graph-theoretic data mining methods. A motif is a subgraph structure that is thought to encode an underlying biological regulatory mechanism, so mining co-regulatory motifs is of great significance for studying how biomolecules are regulated in a co-regulatory network. However, a co-regulatory network is larger than single-molecule regulatory networks such as protein-protein interaction networks and gene regulatory networks, and it contains more types of nodes, so previous motif discovery algorithms have difficulty handling it efficiently. More efficient motif discovery algorithms for co-regulatory networks are therefore needed. The work and contributions of this thesis are as follows:

1) To improve the efficiency of co-regulatory network motif discovery, we apply the G-trie structure to the discovery algorithm: the various co-regulatory motif types are stored in a prefix tree, and the search process is reused across types to speed up subgraph counting. We parallelize the algorithm with multithreading to improve its efficiency further. To find larger co-regulatory motif types, we design a method that generates candidate subgraphs by sampling; with it we can find co-regulatory motif types with up to 8 nodes. In addition, by examining instances of the motif structures in the co-regulatory network, we discover clustering characteristics of co-regulatory network motifs.

2) Although generating candidate subgraphs by sampling can find large motif types, it is difficult to find all motif types in a co-regulatory network this way. Finding all motif types is an NP-hard problem, and the amount of computation grows exponentially with motif size. We therefore design a co-regulatory network motif discovery algorithm based on the MapReduce computation model. The algorithm resolves the iterative dependency of previous motif discovery algorithms and the difficulty, under the MapReduce model, of accurately counting the frequency of each subgraph in the network. Insufficient CPU utilization is addressed with multithreaded parallelism. The resulting algorithm pools computing resources and uses them efficiently, and greatly shortens the time needed to search for all motif types in the co-regulatory network.
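The subgraph frequency counting step mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes an undirected, untyped graph, counts only 3-node connected subgraphs, uses no G-trie, and runs the map and reduce phases in a single process. All function names are hypothetical. Each 3-node subgraph is assigned to the partition that owns its smallest vertex, which is one simple way to make sure every subgraph is counted exactly once across mappers.

```python
from collections import Counter
from itertools import combinations


def classify_triple(adj, a, b, c):
    """Return 'triangle', 'path', or None for the induced subgraph on {a, b, c}."""
    edge_count = sum(v in adj[u] for u, v in ((a, b), (a, c), (b, c)))
    if edge_count == 3:
        return "triangle"
    if edge_count == 2:
        return "path"  # two edges on three vertices always share a vertex
    return None  # fewer than two edges: not connected


def map_phase(adj, roots):
    """Emit (subgraph_type, 1) for each connected triple whose smallest vertex is in roots."""
    for root in roots:
        larger = [v for v in adj if v > root]
        for b, c in combinations(larger, 2):
            kind = classify_triple(adj, root, b, c)
            if kind is not None:
                yield kind, 1


def reduce_phase(pairs):
    """Sum the emitted values per subgraph type."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts


def count_subgraphs(edges, num_partitions=2):
    """Count 3-node connected subgraph types with a map/reduce-style split."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    vertices = sorted(adj)
    # Each partition stands in for the input split of one mapper.
    partitions = [vertices[i::num_partitions] for i in range(num_partitions)]
    emitted = []
    for part in partitions:
        emitted.extend(map_phase(adj, part))
    return reduce_phase(emitted)


if __name__ == "__main__":
    print(count_subgraphs([(1, 2), (2, 3), (1, 3), (3, 4)]))
```

Because ownership is decided by the smallest vertex, the totals do not depend on how many partitions are used. Deciding whether a counted subgraph type is actually a motif would additionally require comparing these frequencies against randomized networks, which this sketch does not attempt.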
Keywords/Search Tags:Motif, Co-regulatory Network, G-trie Structure, Parallel, Distributed Computing