The Research For Recognition Of Transcription Factor Binding Sites Based On Genetic-Neural Network

Posted on: 2010-04-25  Degree: Master  Type: Thesis
Country: China  Candidate: P Huang  Full Text: PDF
GTID: 2120360275989526  Subject: Computer software and theory
Abstract/Summary:
Understanding gene transcription regulation has become one of the central research problems in bioinformatics. Identifying transcription factor binding sites (TFBS) is key to understanding the functions of transcription factors, an important step toward understanding transcription regulation, and a foundation for constructing gene regulatory networks. Transcription factors are proteins that bind to DNA, typically upstream from and close to the transcription start site of a gene, and regulate the expression of the gene by activating or inhibiting the transcription machinery. A growing number of methods and software tools are used to identify TFBS. However, the prediction accuracy of these algorithms is still quite low, so methods for identifying TFBS need further improvement.

As a powerful tool for pattern recognition, the artificial neural network (ANN) offers good nonlinear approximation and robustness, and has been widely and successfully applied to sequence analysis. The BP (back-propagation) neural network uses gradient descent as its learning rule, so it is easily trapped in local optima, which leads to unsatisfactory performance. The genetic algorithm (GA) has reliable global search capability: it does not rely on gradient information and searches for the optimal solution only by simulating the process of natural evolution. Combining the merits of the genetic algorithm and gradient descent, a hybrid algorithm (GA-BP) for training neural networks is put forward. With the GA-BP algorithm, BP networks obtain better initial weights. Finally, we apply the GA-BP algorithm to the problem of identifying transcription factor binding sites.

In this paper, the experimental data are generated with the MatInspector algorithm and a consensus model. First, a large number of nucleic acid sequences are generated from the consensus model. Then, we calculate the score of every sequence and select those whose scores exceed a threshold as experimental samples. In this way, we obtain more data that are closer to real experimental data, compensating for the lack of data.

The GA-BP algorithm in this paper is implemented in MATLAB. We construct five GA-BP networks for five data sets. We also compare our method with the BP algorithm and the genetic algorithm alone; the results indicate that the new model performs favorably.
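
The two-stage idea described in the abstract (a genetic algorithm picks initial weights, then gradient descent refines them) can be sketched as follows. This is only an illustration in Python, not the thesis's MATLAB implementation. The network size, the GA operators (elitism, uniform crossover, Gaussian mutation), the learning rate, and the toy data (all 3-mers, labelled by a hypothetical "TA" prefix) are assumptions, since the abstract does not specify them.

```python
import math
import random

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a flat vector, 4 values per base."""
    return [1.0 if b == base else 0.0 for b in seq for base in BASES]

def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, x, n_hid):
    """One hidden layer; each unit stores its weights followed by a bias."""
    n_in = len(x)
    hidden = []
    for j in range(n_hid):
        base = j * (n_in + 1)
        z = w[base + n_in] + sum(w[base + i] * x[i] for i in range(n_in))
        hidden.append(sigmoid(z))
    off = n_hid * (n_in + 1)
    z_out = w[off + n_hid] + sum(w[off + j] * hidden[j] for j in range(n_hid))
    return hidden, sigmoid(z_out)

def mse(w, data, n_hid):
    return sum((forward(w, x, n_hid)[1] - y) ** 2 for x, y in data) / len(data)

def ga_search(data, n_w, n_hid, rng, pop_size=20, generations=30):
    """Stage 1 (assumed operators): gradient-free global search over weights."""
    pop = [[rng.uniform(-1, 1) for _ in range(n_w)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: mse(w, data, n_hid))
        elite = pop[: pop_size // 4]  # keep the best quarter unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [g + rng.gauss(0, 0.3) if rng.random() < 0.1 else g
                     for g in child]  # Gaussian mutation on ~10% of genes
            children.append(child)
        pop = elite + children
    return min(pop, key=lambda w: mse(w, data, n_hid))

def bp_refine(w, data, n_hid, lr=0.5, epochs=200):
    """Stage 2: online gradient descent (back-propagation) from the GA weights.
    The constant factor 2 of the squared-error gradient is folded into lr."""
    w = list(w)
    best_w, best_loss = list(w), mse(w, data, n_hid)
    for _ in range(epochs):
        for x, y in data:
            n_in = len(x)
            hidden, out = forward(w, x, n_hid)
            d_out = (out - y) * out * (1 - out)
            off = n_hid * (n_in + 1)
            for j in range(n_hid):
                # hidden delta uses the output weight before it is updated
                d_hid = d_out * w[off + j] * hidden[j] * (1 - hidden[j])
                w[off + j] -= lr * d_out * hidden[j]
                base = j * (n_in + 1)
                for i in range(n_in):
                    w[base + i] -= lr * d_hid * x[i]
                w[base + n_in] -= lr * d_hid
            w[off + n_hid] -= lr * d_out
        loss = mse(w, data, n_hid)
        if loss < best_loss:  # remember the best weights seen so far
            best_w, best_loss = list(w), loss
    return best_w

# Toy data (hypothetical): every 3-mer, positive if it starts with "TA".
seqs = [a + b + c for a in BASES for b in BASES for c in BASES]
data = [(one_hot(s), 1.0 if s.startswith("TA") else 0.0) for s in seqs]

rng = random.Random(0)
n_in, n_hid = 12, 3
n_w = n_hid * (n_in + 1) + n_hid + 1

w_ga = ga_search(data, n_w, n_hid, rng)
w_final = bp_refine(w_ga, data, n_hid)
ga_loss = mse(w_ga, data, n_hid)
final_loss = mse(w_final, data, n_hid)
print("loss after GA:", ga_loss, "loss after GA + BP:", final_loss)
```

Because `bp_refine` keeps the best weights it has seen, the refined loss can never be worse than the GA starting point. How much it improves depends on the assumed hyperparameters and says nothing about the results reported in the thesis.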
Keywords/Search Tags: Transcription factor binding sites (TFBS), BP neural network, Genetic algorithm (GA), Consensus model