Study of DNA Sequence Motif Identification Algorithm

Posted on: 2015-07-08
Degree: Master
Type: Thesis
Country: China
Candidate: R M Su
Full Text: PDF
GTID: 2180330473954633
Subject: Software engineering
Abstract/Summary:
Bioinformatics is a science that uses information technology to solve biological problems. It faces many challenges, such as gene discovery and revealing the structure of genes and proteins. One important challenge is to identify the transcription factor binding sites (TFBSs) in DNA sequences, which regulate gene expression by activating or inhibiting transcription. These binding sites are short DNA fragments, referred to as motifs. Given a collection of DNA sequences, the motif search problem is to detect over-represented motifs that provide a good candidate set for TFBSs. Many algorithms have been proposed so far, but without accompanying interactive software tools. To provide a new tool in this area, this thesis designs an algorithm that can effectively identify DNA motifs and develops a new motif search software system.

1. This thesis proposes a new consensus-based motif recognition algorithm. First, the whole problem is divided into a number of mutually independent sub-problems. Second, each sub-problem is evaluated and classified into one of two categories: sub-problems that must be computed and sub-problems that need not be computed. Finally, each remaining sub-problem is solved by searching the motif tree structure with branch and bound.

2. This thesis designs and implements a software system based on the proposed algorithm. First, the functional and performance requirements of a motif search system are analyzed. Then, the system is divided into five modules: user interface, input, output, recognition algorithm, and overall control. Finally, each module is designed in detail.

3. The software system is tested for both function and performance. The software can find motifs under various forms of input and various parameter settings. Moreover, the algorithm is compared with two other well-known algorithms on running time and computational accuracy, and the results show that it performs better in the identification of long motifs.
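
The abstract does not give the details of the proposed algorithm, so the following is only an illustrative sketch of the general idea of a consensus search over a motif tree with branch and bound. It assumes the classic median-string formulation (find the length-k pattern with the smallest total Hamming distance to the input sequences), which is a stand-in and not necessarily the thesis's method. The names `min_hamming`, `total_distance`, and `median_string` are hypothetical.

```python
from typing import List, Tuple

ALPHABET = "ACGT"


def min_hamming(pattern: str, seq: str) -> int:
    """Smallest Hamming distance between pattern and any same-length window of seq."""
    k = len(pattern)
    return min(
        sum(a != b for a, b in zip(pattern, seq[i:i + k]))
        for i in range(len(seq) - k + 1)
    )


def total_distance(pattern: str, seqs: List[str]) -> int:
    """Sum of the best-window distances over all sequences."""
    return sum(min_hamming(pattern, s) for s in seqs)


def median_string(seqs: List[str], k: int) -> Tuple[str, int]:
    """Depth-first search of the 4-ary prefix tree of candidate motifs.

    Assumption: every sequence has length >= k. Each subtree under a
    prefix is an independent sub-problem. A prefix's total distance is a
    lower bound on the distance of any extension, so a subtree whose
    bound cannot beat the best motif found so far is skipped.
    """
    best_pattern = ""
    best_score = k * len(seqs) + 1  # worse than any achievable score

    def search(prefix: str) -> None:
        nonlocal best_pattern, best_score
        score = total_distance(prefix, seqs) if prefix else 0
        if score >= best_score:
            return  # bound: this sub-problem need not be computed
        if len(prefix) == k:
            best_pattern, best_score = prefix, score
            return
        for base in ALPHABET:  # branch: extend the prefix by one base
            search(prefix + base)

    search("")
    return best_pattern, best_score


if __name__ == "__main__":
    sequences = ["TTACGTAA", "GGACGTCC", "CACGTTTG"]
    print(median_string(sequences, 4))
```

In this sketch the pruning test plays the role of classifying sub-problems into those to be computed and those not to be computed. A practical tool would also need to handle reverse complements, degenerate positions, and much longer inputs.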
Keywords/Search Tags: Bioinformatics, motif finding, software system, DNA