Similarity Analysis Of DNA Sequences Based On Graphical Representation

Posted on: 2017-08-02
Degree: Master
Type: Thesis
Country: China
Candidate: K Wang
GTID: 2310330482496077
Subject: Mathematics
Abstract/Summary:
The completion of the Human Genome Project and the full implementation of model-organism genome projects have produced a large amount of sequence data. The focus of biological research has shifted from acquiring and accumulating data to interpreting and analyzing it. In this context bioinformatics has emerged, whose main purpose is to extract biological information by analyzing sequence data. Bioinformatics is an interdisciplinary subject that combines mathematics, biology, computer science and information science to mine and abstract the biological rules hidden in biological sequences. Designing efficient graphical representations and analyzing the similarity of biological sequences is a popular topic in bioinformatics.

In this thesis, we develop new graphical representation methods for DNA sequences and analyze the similarity of DNA sequences based on the graphical representation. The main achievements are summarized below:

1) We propose a new and effective graphical representation method for DNA sequences, called B-curve. We give the detailed process of the method and measure sequence similarity by the Euclidean distance between 24-dimensional vectors whose components are extracted from the graphs.

2) To demonstrate the effectiveness of B-curve, we construct a similarity matrix and a phylogenetic tree, both based on B-curve, for the coding sequences of the first exon of the β-globin gene of 11 different species. To further illustrate the utility of the method, we apply it to the mitochondrial gene sequences of 45 species. The results for both data sets are consistent with biological evolution.

3) We apply the 2D graphical representation to two data sets of influenza A virus and analyze their similarity. Moreover, we compare our approach with 6 other methods based on sequence similarity analysis, and the results illustrate the utility and superiority of the B-curve method.
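The distance-based comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the 24 components are extracted from the B-curve, so the sketch takes the feature vectors as given. The function names `euclidean_distance` and `similarity_matrix` and the toy vectors are hypothetical placeholders, not the thesis's implementation or data.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def similarity_matrix(vectors):
    """Pairwise distance table for a dict mapping names to feature vectors.

    A smaller entry means the two sequences are more similar.
    """
    names = list(vectors)
    return {
        a: {b: euclidean_distance(vectors[a], vectors[b]) for b in names}
        for a in names
    }


# Toy 24-dimensional vectors (placeholders, NOT real B-curve descriptors).
toy_vectors = {
    "seq_A": [0.0] * 24,
    "seq_B": [1.0] * 24,
    "seq_C": [0.0] * 23 + [3.0],
}

matrix = similarity_matrix(toy_vectors)
for name, row in matrix.items():
    print(name, [round(row[other], 3) for other in matrix])
```

A table of this kind is symmetric with zeros on the diagonal, and it is the usual input to distance-based tree-building methods. The abstract does not state which tree-building method the thesis uses.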
Keywords/Search Tags:DNA, Graphical representation, Similarity matrix, Sequence analysis, Influenza virus