
The Comparison Of Biological Sequences For Phylogenetic Tree Based On 2-step Markov Model

Posted on: 2014-10-15
Degree: Master
Type: Thesis
Country: China
Candidate: J Xu
Full Text: PDF
GTID: 2250330425483279
Subject: Applied Mathematics
Abstract/Summary:
Since the completion of the Human Genome Project's high-accuracy sequence map, the number of bases in gene databases has grown substantially. To manage these data properly, mine them for useful biological information, and analyze them, many biologists, mathematicians, and computer scientists have entered this emerging field. Bioinformatics, also called computational molecular biology, is the interdisciplinary field that arose from this work. The analysis and comparison of biological sequences is one of its most important topics. Comparative methods for biological sequences are generally divided into two categories: alignment-based methods and alignment-free methods. Because the volume of sequence data is now very large, alignment-based methods are computationally expensive, time-consuming, and costly, so researchers have shifted their focus to alignment-free methods. This thesis takes the 2-step Markov model as its object of study and proposes alignment-free models for DNA sequences. The main contents are summarized as follows.

In the second chapter, we note that comparing DNA sequences gives their degree of similarity, which helps to infer their relationships in terms of structure, function, and evolution. We introduce the fundamentals of a weighted relative entropy based on the 2-step Markov model for comparing DNA sequences. A DNA sequence, which consists of the four characters A, T, C, and G, can be regarded as a Markov chain. Taking the state space I = {A, T, C, G} and describing each DNA sequence by its 2-step transition probability matrix, we obtain eigenvalues for the sequence and use them to define a similarity metric. This gives a new method for comparing DNA sequences, which we use to classify chromosomal DNA sequences obtained from 30 species.
The phylogenetic tree built with this alignment-free method, from the distance matrix given by the weighted relative entropy, shows a clearer and more accurate division.

In the third chapter, we propose a new geometric representation model that maps a DNA sequence to a three-dimensional curve, converts the curve into a numerical representation, and then extracts eigenvalues. Using the resulting distance matrix, we reconstruct the evolutionary tree of the NA fragments from 8 virus cRNA sequences and analyze their similarities.
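
The second-chapter idea can be illustrated with a minimal sketch. It is not the thesis's actual formulation, which the abstract does not give. It assumes that the "2-step transition probability" is estimated by counting pairs of bases two positions apart, that the weights are the base frequencies of the first sequence, and that a pseudocount of 1 avoids zero probabilities. It compares the matrices directly by relative entropy rather than through eigenvalues. All function names are hypothetical.

```python
import math
from collections import Counter

BASES = "ACGT"

def two_step_matrix(seq, pseudocount=1.0):
    """Estimate P(x[i+2] = b | x[i] = a) by counting base pairs two positions apart.

    Assumption: this is one reading of "2-step transition probability";
    the pseudocount is added only to keep every probability positive.
    """
    counts = Counter((seq[i], seq[i + 2]) for i in range(len(seq) - 2))
    matrix = {}
    for a in BASES:
        row_total = sum(counts[(a, b)] for b in BASES) + 4 * pseudocount
        matrix[a] = {b: (counts[(a, b)] + pseudocount) / row_total for b in BASES}
    return matrix

def base_frequencies(seq):
    """Relative frequency of each base, ignoring any non-ACGT characters."""
    total = sum(seq.count(b) for b in BASES)
    return {b: seq.count(b) / total for b in BASES}

def weighted_relative_entropy(seq_p, seq_q):
    """Row-wise relative entropy of two 2-step matrices.

    Assumption: rows are weighted by the base frequencies of seq_p.
    """
    p, q = two_step_matrix(seq_p), two_step_matrix(seq_q)
    w = base_frequencies(seq_p)
    return sum(
        w[a] * p[a][b] * math.log(p[a][b] / q[a][b])
        for a in BASES
        for b in BASES
    )

def distance(seq_a, seq_b):
    """Symmetrized version, usable as an entry of a distance matrix."""
    return 0.5 * (
        weighted_relative_entropy(seq_a, seq_b)
        + weighted_relative_entropy(seq_b, seq_a)
    )
```

Computing distance for every pair of sequences gives a symmetric matrix with a zero diagonal, which a standard distance-based tree-building method could then take as input.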
Keywords/Search Tags: 2-step Markov model, classification of DNA sequences, weighted relative entropy, geometric representation model, phylogenetic tree