Some Novel Graphical Representations Of Biological Sequences And Their Application

Posted on: 2009-11-01
Degree: Master
Type: Thesis
Country: China
Candidate: Z G Yu
GTID: 2120360245487704
Subject: Operational Research and Cybernetics
Abstract/Summary:
Bioinformatics is a new interdisciplinary field that arose alongside the human genome sequencing projects. It mainly studies complex computational problems in molecular biology related to gene and protein sequences. Broadly, bioinformatics is concerned with acquiring, processing, storing, distributing, and interpreting the biological information produced by genome sequencing research. This definition has two aspects: the first is collecting, organizing, and serving massive amounts of data; the second is discovering new rules in these data.

Sequence alignment holds an important place in bioinformatics because it underlies many more complicated problems. Alignment methods based on scoring functions are now comparatively mature, but they ignore some chemical properties and chemical structures, and the choice of scoring function, which determines the similarity scores of the sequences, is somewhat arbitrary. For RNA, a further deficiency is that these methods are not well suited to longer RNA secondary structures or to RNA secondary structures with pseudoknots.

In recent years, graphical representations of biological sequences have played an increasingly important role in research on local and global alignment of biological sequences, and similarity analysis based on numerical characteristics gives the intuitive visual impression a more reasonable basis. In this thesis, the author outlines some novel graphical representations of DNA sequences and RNA secondary structures and describes their application to the similarity analysis of biological sequences.

The main innovations of this thesis are as follows:

In chapter 2, the author first proposes a novel 2-D graphical representation of DNA primary sequences based on four characteristic curves, and then introduces a way of constructing 4-component characteristic vectors for similarity analysis. An examination of the similarities/dissimilarities among the coding sequences of the first exon of the β-globin gene of different species illustrates the utility of the approach. Because of some deficiencies of the 2-D graphical representation itself, the author improves it and proposes a novel 8-D graphical representation, then constructs a new 5-D characteristic vector for similarity analysis.

In chapter 3, the author proposes a novel 2-D graphical representation of RNA secondary structures based on their chemical properties and chemical structures. The 2-D graphical representation is nonsingular and is not limited by pseudoknots. A simple numerical characteristic vector is then introduced, and an examination of the similarities/dissimilarities among the secondary structures at the 3'-terminus of different species illustrates the utility of the approach.
Keywords/Search Tags:DNA sequence, RNA secondary structure, graphical representation, L/L matrix, normal leading eigenvalue, similarity analysis
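
The keywords name the L/L matrix and its leading eigenvalue as the numerical descriptors behind the similarity analysis. The abstract does not give the thesis's actual curve construction, so the following is only a minimal sketch of the general technique as it is commonly described in the graphical-representation literature: each off-diagonal entry of the L/L matrix is the Euclidean distance between two points of a curve divided by the path length along the curve between them. The base-to-height mapping `BASE_Y`, the single zigzag curve, and the function names are assumptions made for illustration; they are not the four characteristic curves of chapter 2.

```python
import numpy as np

# Hypothetical base-to-height mapping, chosen only for illustration;
# the thesis's own characteristic curves are not specified in the abstract.
BASE_Y = {"A": 2.0, "G": 1.0, "C": -1.0, "T": -2.0}


def curve_points(seq):
    """Map a DNA string to 2-D points (position, height) forming a zigzag curve."""
    return np.array([[i + 1.0, BASE_Y[b]] for i, b in enumerate(seq.upper())])


def ll_matrix(points):
    """L/L matrix: Euclidean distance divided by along-curve path length.

    Diagonal entries are zero by definition.
    """
    n = len(points)
    # Length of each edge between consecutive points, then cumulative arc length.
    steps = np.linalg.norm(np.diff(points, axis=0), axis=1)
    arc = np.concatenate(([0.0], np.cumsum(steps)))
    euclid = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    path = np.abs(arc[:, None] - arc[None, :])
    m = np.zeros((n, n))
    off = ~np.eye(n, dtype=bool)
    # Path length is positive off the diagonal because x strictly increases.
    m[off] = euclid[off] / path[off]
    return m


def normalized_leading_eigenvalue(seq):
    """Largest eigenvalue of the (symmetric) L/L matrix, divided by sequence length."""
    m = ll_matrix(curve_points(seq))
    return float(np.linalg.eigvalsh(m)[-1]) / len(seq)
```

Under these assumptions, two sequences could be compared by the absolute difference of their `normalized_leading_eigenvalue` values, with a smaller difference read as greater similarity. A characteristic vector like the 4-component one described in the abstract would collect one such value per curve and compare vectors by, for example, Euclidean distance.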